USER CENTERED SYSTEM DESIGN:

PAPERS  FOR  THE  CHI  '83  CONFERENCE 
HUMAN  FACTORS  IN  COMPUTER  SYSTEMS 

Abstract

This report includes four papers by the UCSD Project on Human-Computer Interfaces
presented at the 1983 Conference on Human Factors in Computer Systems (Boston, December
1983). The first paper, Evaluation and Analysis of User's Activity Organization (Bannon,
Cypher, Greenspan, and Monty), analyzes the activities performed by users of computer
systems. These activities show complex patterns of interleaving. This paper develops a
framework for discussing the characteristics of these activities in terms of Activity Structures,
and provides a number of conceptual guidelines for developing an interface which supports
activity coordination.

The second paper, A Proposal for User Centered System Documentation (O'Malley,
Smolensky, Bannon, Conway, Graham, Sokolov, and Monty), outlines a set of proposals for
the development of system documentation based on an analysis of user needs. The paper
outlines three specific proposals: a quick-reference facility, a command-line database, and a
facility for full explanation and instruction. The paper suggests a way of combining these
facilities into an integrated, structured manual, offering more effective user support than is
currently provided.

The third paper, Questionnaires as a Software Evaluation Tool (Root and Draper), reports
on a study investigating the strengths and weaknesses of questionnaires as software evaluation
tools. The results suggest that it is important to distinguish between questions addressed to
existing features that the users have already experienced and questions about proposed new
features, no matter how specific, that they cannot have had experience with. The most
successful question type is the checklist, which yields a list of areas needing attention.
However, this is primarily useful for evaluating the designer's sins of commission.

The fourth paper, Design Principles for Human-Computer Interfaces (Norman), discusses
some of the properties that useful principles should have and presents examples of a tradeoff
analysis. Any single design technique is apt to have its virtues and deficiencies along different
dimensions. Tradeoff analysis provides a quantitative method of assessing tradeoff relations
for two attributes by first determining the User Satisfaction function for each, then showing
how one trades off against the other. The analysis is used to examine the tradeoff of
information versus time, and also of editor workspace versus menu size. Tradeoffs involving
command languages versus menu-based systems, choices of names, and handheld computers versus
work stations are examined briefly.

EVALUATION AND ANALYSIS OF USER’S
ACTIVITY ORGANIZATION


Liam  Bannon,  Allen  Cypher,  Steven  Greenspan,  and  Melissa  L.  Monty 
Institute  for  Cognitive  Science 
University  of  California,  San  Diego 


Abstract 

Our analyses of the activities performed by users
of computer systems show complex patterns of
interleaved activities. Current human-computer
interfaces provide little support for the kinds of
problems users encounter when attempting to
accomplish several different tasks in a single
session. In this paper we develop a framework for
discussing the characteristics of activities, in
terms of activity structures, and provide a
number of conceptual guidelines for developing
an interface which supports activity coordination.
The concept of a workspace is introduced as a
unifying construct for reducing the mental
workload when switching tasks, and for supporting
contextually-driven interpretations of the users'
activity structures.


This document was jointly authored by the members of the
Activity Structures Research Group listed above in alphabetical order.
We wish to thank Don Norman, Bob Glushko, and Jonathan
Grudin for their insightful comments on earlier drafts of this paper
and Tom Erickson for his related ideas and software design.

The Activity Structures Research Group operates under the
auspices of the Human-Machine Interface project (UCSD), and is
comprised of computer scientists and psychologists. This research
is supported by Contract N00014-79-C-0323, NR 667-437 with the
Personnel and Training Research Programs of the Office of Naval
Research and by a grant from the System Development
Foundation. Steven L. Greenspan is supported on a Postdoctoral
Fellowship by Grant PHS MH 14268 to the Center for Human
Information Processing from the National Institute of Mental Health.

Requests for reprints should be sent to HMI, Institute for
Cognitive Science C-015; University of California, San Diego; La Jolla,
California, 92093, USA.

To be published in: Proceedings of the CHI 1983 Conference
on Human Factors in Computer Systems. Boston; December,
1983.


Introduction 

The intent of this paper is to give an overview of the
work being carried out by the Activity Structures Research
Group at UCSD. Our interests are focused upon (a)
developing a methodology for studying the complex
structuring of activities that often occurs in human-computer
interaction, and (b) designing a human-computer interface
that is supportive of the user's conception of these
multilevel complex activity structures. Our observations of
user-computer interactions strongly suggest that the command
sequences employed by users are structured and coherent,
and that users require computer systems which (a) provide
information to reorient the user when resuming tasks which
have been interrupted, and (b) minimize the interference
incurred when setting up a new task.

We started by analyzing patterns of user-computer
interaction. We did this by examining history lists of
commands performed by users, aided by a system that reminded
users to periodically annotate their lists with descriptions of
their intentions during the session. An extract from one of
these augmented history lists is provided in Figure 1. Files
such as the one shown were collected automatically over a
number of sessions with each subject. From the command
histories (on the numbered lines) and the user's comments
(in < >'s), we were able to discern the users' goals,
subgoals and tasks, and how they evolved over time.

Figure 1. Example of an annotated history list showing
complex interleaving of activities.

The data reveal interesting patterns of commands.
Users seem to engage in a number of different activities that
can be partitioned into sets of goal-related tasks. For
instance, in any single session a user may edit an article,
write and debug a program, search for an old file, and
answer mail. Often users will work on one task and then,
before completing that task, switch to a second one. Tasks
tend to be nested within one another, and digressions are
frequent. When a record is made of these commands as they
occur temporally, as in the history list, information on the
tasks and goals of the users is lost. An alternative
organization of user commands which preserves their task specificity
would be useful (see Figure 2). Separating the history list
into such functionally distinct units is difficult due to the
interleaved activities of the user. In attacking this problem
from an Artificial Intelligence perspective, Huff & Lesser
(1982) describe a system that utilizes clues in the input
patterns to infer the user's activity structures. Our alternative
approach assists the user in explicitly indicating the goals
associated with each command. Once the user's separate
tasks are known, it is possible to treat each task as a
separate workspace. The user's pattern of activity may then
be characterized as movements from one workspace to
another.
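
As an illustration only, the reorganization of a temporal history list into task-related command sets might be sketched as follows. The history entries, the task labels, and the function names are hypothetical; the sketch assumes that each command already carries a task label supplied by the user, as proposed above.

```python
from itertools import groupby

# Hypothetical annotated history: (line number, command, user-supplied task label).
history = [
    (1, "vi paper.txt", "edit article"),
    (2, "cc prog.c", "debug program"),
    (3, "a.out", "debug program"),
    (4, "vi paper.txt", "edit article"),
    (5, "mail", "answer mail"),
    (6, "vi prog.c", "debug program"),
]

def group_by_task(entries):
    """Reorganize a temporal history into task-related command sets."""
    tasks = {}
    for line_no, command, task in entries:
        tasks.setdefault(task, []).append((line_no, command))
    return tasks

def workspace_movements(entries):
    """Collapse the history into the sequence of workspaces visited."""
    return [task for task, _ in groupby(entries, key=lambda entry: entry[2])]

task_sets = group_by_task(history)
moves = workspace_movements(history)
```

Here `task_sets` keeps every command of a task together regardless of interleaving, while `moves` records the user's movements from one workspace to another.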


The  workspace 

Our framework for discussing activity structures rests
heavily on the notion of a 'workspace': an environment
dedicated to allowing easy user manipulation of activities to
achieve a particular goal or set of functionally related goals.
Making the workspace the basic entity in activities
coordination has important implications at two levels. From the
users' perspective, workspaces must have highly dynamic
internal structures which can be modified as users
reformulate their goals. From the system's perspective, a workspace
contains tools and data relevant to the users' goals and, in
addition, provides a record of the ongoing activities or
processes resulting from applying a set of tools to a set of
data structures. The internal structure of the workspace
thus reflects both the users' goals and the software tools
that the computer system can provide for accomplishing
these goals.

Figure 2. Conceptual model illustrating the reorganization
of the command history into functionally distinct
task-related command sets.

Workspaces combine the ideas behind functionally-defined
directories and workbenches, 1 but they are more
dynamic: they preserve information about the status of
activities, and they are capable of being partitioned,
recombined, and interrelated by the user. Personal notes and
comments on the various files, together with a system for
displaying information about the workspace organization,
can be strategically located within the workspace,
functioning as memory aids and descriptors of the contents or goals
defined by the workspace.

Users encountering a computer system for the first
time need models for organizing their activities and
accomplishing tasks. They could be provided with a number of
'skeletal' workspaces containing a set of tools and examples
of their use. Users would be encouraged to build upon the
system-provided frameworks by rearranging and
personalizing them to suit their needs.


1. A workbench provides an environment which contains a set of
programs that are functionally related to a particular type of task
(e.g., 'writing aids' or 'programming').




Activity  Coordination  Issues 

Our investigations into activity structuring resulted in
the identification of several classes of activity coordination
problems. Based on our analyses of users' annotated
command histories, we are suggesting some guidelines for the
development of a uniform interface system which should
handle the coordination and execution of activities at many
levels.

Reducing  mental  load  when  switching  tasks 

A common problem with many interfaces is how to
avoid situations where users must struggle with the
interface whenever they wish to switch topics. Suppose a user is
doing some task but then thinks of an idea relevant to
another task. In order for the user to be able to make a
note about an idea in the relevant environment, the user
must stop the current task, switch directories, open a new
file, and, if the idea is still remembered after all this
activity, finally type it in. Many users cope with this
problem by keeping a pad of paper handy for such occasions.
Our proposals are designed to decrease the amount of
cognitive overhead (the load on working memory)
required for the system interface and to free the user's
thinking capacity for the true tasks at hand.

From the user's point of view, entering text should
require practically no effort. This implies that naming,
organizing, and locating the text within the file system
(tasks intended to aid later retrieval) should not impede the
activity of entering the text. If necessary, these activities
should be delayed indefinitely. The workspace should allow
the users to place notes at any position in the current
workspace, just as they might "jot down" a note in the
margin of a paper. Memory for the context within which the
note was created will function as a natural aid to retrieval
of the note long after it was written. When the user is
ready to devote time to the task, the note could be named,
duplicated, and/or relocated. The system would also allow
the user to insert reminders and identifiers within the
workspace itself, for quick reference on the purpose or
context of that workspace.
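
A minimal sketch of this deferred housekeeping, under stated assumptions, is given below. The `Workspace` class, its method names, and the note contents are hypothetical and are not taken from any existing system; the point is only that a note is stored the moment it is entered and is named later, if ever.

```python
import itertools

class Workspace:
    """Hypothetical workspace that accepts notes before they are named or filed."""

    def __init__(self, name):
        self.name = name
        self.notes = []
        self._numbers = itertools.count(1)

    def jot(self, text):
        """Store a note immediately; only a sequence number identifies it."""
        note = {"number": next(self._numbers), "text": text, "name": None}
        self.notes.append(note)
        return note

    def name_note(self, number, name):
        """Name a note later, once the user has time to organize it."""
        for note in self.notes:
            if note["number"] == number:
                note["name"] = name
                return note
        raise KeyError(number)

    def unnamed(self):
        """List the numbers of notes whose housekeeping is still pending."""
        return [n["number"] for n in self.notes if n["name"] is None]

article = Workspace("edit article")
article.jot("ask about the figure caption")
article.jot("idea for the mail filter program")
article.name_note(2, "mail-filter-idea")
pending = article.unnamed()
```

Entering a note requires no decision about names or locations; `unnamed()` simply reports which notes still await that decision.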

Suspending  and  resuming  activities 

Our observation that users rarely complete any
time-consuming activity before beginning another task suggests
that a mechanism for easily suspending the current activity
is essential. For example, we have found that while doing a
task, users often encounter some problem that should be
fixed immediately. This means stopping the main task,
doing the fix or housekeeping chore, and then resuming the
initial task. These contextually-driven activities we call
'digressions'. While they are common, and often a means of
accomplishing many secondary goals which might be
forgotten otherwise, they can distract a user from the goals of the
main task.

The workspace system should support digressions
while providing a large-scale placeholder to facilitate easy
return to previous activities. Restarting an activity should
return the user to the precise state and location within the
environment which was frozen, thereby freeing users from
the difficult task of remembering what they were doing
previously. Some systems currently support this. Berkeley
Unix allows tasks to be stopped and resumed, but it does
not readily support retention of the full context. Some of
the new computer systems with 'window' facilities, such as
the Xerox STAR (Smith, Irby, Kimball, & Verplank, 1982),
support certain digressions by allowing users to maintain
multiple windows on the screen and to enlarge or shrink
these windows and switch between them at will. Saving
groups of functionally related windows together as a
workspace would help preserve the context of an activity in
units which are more meaningful to users.
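
One way to picture the placeholder idea is a stack of suspended workspaces, each frozen with its state. The following is only a sketch under that assumption; the `Session` and `Workspace` classes and the contents of `state` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Workspace:
    name: str
    # Hypothetical saved context, e.g. the open file and the current line.
    state: dict = field(default_factory=dict)

class Session:
    def __init__(self, initial):
        self.current = initial
        self.suspended = []  # stack of placeholders for interrupted activities

    def digress(self, name):
        """Freeze the current workspace and start a digression."""
        self.suspended.append(self.current)
        self.current = Workspace(name)
        return self.current

    def resume(self):
        """Return to the most recently suspended workspace, state intact."""
        if not self.suspended:
            raise RuntimeError("no suspended workspace to resume")
        self.current = self.suspended.pop()
        return self.current

session = Session(Workspace("edit article", {"file": "paper.txt", "line": 120}))
session.digress("fix printer queue")
session.current.state["file"] = "queue.log"
resumed = session.resume()
```

Because the frozen workspace is kept whole, resuming restores the file and position the user left, rather than relying on the user's memory.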

Maintaining  records  of  activities 

The history of a user's activity within a workspace
constitutes a record of the command sequence for
performing that activity. We call this command sequence an Activity
Script. Activity scripts are useful whenever a user wants to
perform an activity similar to something done before. If the
desired activity is exactly the same as a previous one, the
user should be able to redo the old activity script. If there
are only small differences, the user can edit the script and
then execute it. If there are major differences, the old
script can still be very useful as a guide to the new activity.
No special effort should be involved in creating an activity
script, for they are simply records of transactions. Activity
scripts could be used to support sophisticated redo/undo
facilities which would act on sequences of commands at the
task level, rather than just on single command lines.
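
The redo and edit-then-execute cases can be sketched as follows. The `ActivityScript` class, the command strings, and the stand-in executor `pretend_run` are all hypothetical; a real system would hand each command to its own command interpreter.

```python
class ActivityScript:
    """Hypothetical record of the commands issued within one workspace."""

    def __init__(self, commands=None):
        self.commands = list(commands) if commands else []

    def record(self, command):
        """Append a command as a side effect of normal use."""
        self.commands.append(command)

    def edited(self, old, new):
        """Return a new script with `old` replaced by `new` in every command."""
        return ActivityScript([c.replace(old, new) for c in self.commands])

    def redo(self, execute):
        """Replay the whole script through the supplied `execute` callable."""
        return [execute(c) for c in self.commands]

def pretend_run(command):
    # Stand-in executor: reports the command instead of running it.
    return "ran: " + command

script = ActivityScript()
for command in ["format chapter1", "proof chapter1", "print chapter1"]:
    script.record(command)

same_again = script.redo(pretend_run)
variant = script.edited("chapter1", "chapter2").redo(pretend_run)
```

The script is produced by recording alone, so no special effort is needed to create it, and the edited copy leaves the original record untouched.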

Functional  groupings  of  activities 

We expect that users will want to organize their
workspaces to reflect functional groupings. All material
associated with a given activity or large group of activities
could then be accessed as a unit, allowing more continuity
between work sessions. For example, the functional
relationship between elements of a workspace may be as loosely
defined as the group of tasks the user desires to accomplish
over the next few sessions. The workspace contents, in this
case, function much like a 'To Do' list. The potential for
hierarchical grouping of workspaces means that this 'To Do'
workspace might have any number of lower-level
workspaces within it that could be maintained in any state
of completion, perhaps with notes as reminders of what to
do next. When the workspace is reactivated, the user would
be shown where work was discontinued, and the activity
history list would be available for reference.

Multiple  perspectives  on  the  work  environment 

One consequence of creating complex work
environments is that users often lose track of their goals and
current location within the system. Multiple window
displays do not in themselves solve this problem, as the
'messy desk' phenomenon can appear with a vengeance. We
address this problem by proposing a set of tools, and
ultimately displays, that allow users to organize their activities.
The rich interconnections among activities require a support
system that allows users to have multiple perspectives on
their activities. Such perspectives would include displays
indicating activity sequences based on such measures as
temporal ordering and goal ordering, the latter being useful in
keeping track of the many subtasks necessary to achieve a
goal. Possibly other information (e.g., the temporal
dependencies between tasks) could also be presented. 2

2. The Apple Lisa computer system, for instance, has a PERT
program that displays the temporal dependencies existing between
different tasks.

Interdependencies among items in different workspaces

Within a complex goal-structured environment, many
items (e.g., text files and activity scripts) may be important
for more than one task. In practice, this means that very
often the user may want to assign a portion of one
workspace to another workspace. It is therefore important
to support the ability to have multiple instances of a
particular file or note.

Updating one instance of such an item, however, leads
to the question of whether or not to generalize this update
to all or several instances. In any viable system the user
should be able to trace the relations between workspaces
and files and be able to locate multiple instances of a
particular item. This facility might enable the system to query
the user whenever an item duplicated in other workspaces is
updated.
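
A minimal sketch of such a facility, assuming each workspace holds its own copy of an item, is shown below. The `Catalog` class, the item and workspace names, and the `ask_user` callback (which stands in for an interactive query) are hypothetical.

```python
class Catalog:
    """Hypothetical index of which workspaces hold an instance of each item."""

    def __init__(self):
        self.instances = {}  # item name -> {workspace name: content}

    def place(self, item, workspace, content):
        """Put an instance of the item into a workspace."""
        self.instances.setdefault(item, {})[workspace] = content

    def locate(self, item):
        """List every workspace holding an instance of the item."""
        return sorted(self.instances.get(item, {}))

    def update(self, item, workspace, content, ask):
        """Update one instance, then ask which other instances should follow."""
        copies = self.instances[item]
        copies[workspace] = content
        others = [w for w in sorted(copies) if w != workspace]
        if others:
            for chosen in ask(item, others):
                if chosen in copies:
                    copies[chosen] = content

def ask_user(item, others):
    # Stand-in for an interactive query: propagate to one workspace only.
    return [w for w in others if w == "annual report"]

catalog = Catalog()
for ws in ["grant proposal", "annual report", "to do"]:
    catalog.place("budget.txt", ws, "draft 1")

catalog.update("budget.txt", "grant proposal", "draft 2", ask_user)
```

The query is raised only when other instances exist, and the user's answer decides how far the update is generalized.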

Summary 

In our empirical data, the annotations that users
added to their history lists showed that they view their
interactions with the computer in terms of goals rather than
system commands. Accomplishing these goals involves
translating them into a series of commands, actually
executing the commands, and evaluating the result for success. In
the course of this process, users often become confused
about their overall goals or lose track of their secondary
goals. Our plan is to provide users with a system for
managing activities which maps well onto the users' own goal
structures, thus reducing the mental load on the users.

Our current research activities focus on both empirical
and theoretical developments. We have built a working
system (Notepad) to explore in detail the issues involved in
how best to postpone certain housekeeping chores while the
user is busy creating new text. 3 This system also allows us
to explore questions concerning how to manipulate, save
and retrieve contexts. In addition, we are developing the
concept of a workspace as an integrating idea that has
implications for human-computer interface design.


References 

Huff, K. E., & Lesser, V. R. Knowledge-Based Command
Understanding: An Example for the Software Development
Environment. Amherst, Massachusetts: Computer
and Information Science, University of Massachusetts,
Amherst. June 30, 1982. (Technical Report 82-6.)

Perlman, G. Two Papers in Cognitive Engineering: The
design of an interface to a programming system, and
MENUNIX: A menu-based interface to UNIX (User
manual). La Jolla, California: Center for Human
Information Processing, University of California, San Diego.
November, 1981. (Report No. 8105.)

Smith, D. C., Irby, C., Kimball, R., & Verplank, B. Designing
the Star User Interface. Byte, 1982, 7 (No. 4: April),
242 - 282.

Teitelman, W., & Masinter, L. The Interlisp programming
environment. Computer, 1981, 14 (April, No. 4), 25 - 33.


3. It is useful here to introduce another aspect of the work on
activity structures. Cypher has designed a system called Notepad for
collecting and organizing textual notes. The principles of
organization embodied in Notepad are central in complex activity
structuring. A note can be input into the system without first specifying a
name or the location of where it should be stored. The note
appears as a subset of the current environment and is identified by a
number (temporal information). If the note is left uncompleted
when the user switches to another task, the system signals the user
that there is an uncompleted task and supports reactivating the
note for further manipulation. At any time, the note may be
named, renamed, or relocated. Notes are hierarchically organized
with mechanisms which allow the user to specify the structural
family in terms of parents, children, and sibling notes. A note can
have multiple parents, meaning that it can be available within any
appropriate context and can be easily retrieved.

Abstract

This paper outlines a set of proposals for the development
of system documentation based on an analysis of user needs.
It is suggested that existing documentation is not sensitive
enough to the variety of levels of user expertise, nor to the
variety of contexts in which on-line help is required. We
outline three specific proposals for fulfilling these needs: a
quick reference facility, a command-line database, and a
facility for full explanation and instruction, and suggest a
number of ways in which users might access these facilities.
Finally, we suggest a way of combining these facilities into
an integrated structured manual, offering more effective
user support than is currently provided.

Introduction 

This paper outlines a project on the development and
use of system documentation that is currently underway as a
part of the Human Machine Interaction project at UCSD.
The goal of the project is twofold: to develop a conceptual
framework for discerning the documentation needs of
computer users, and to implement a number of software tools
that will demonstrably improve the user support currently
provided on our UNIX 1 system.


This research was conducted under Contract N00014-79-C-0323,
NR 667-437 with the Personnel and Training Research Programs of
the Office of Naval Research and by a grant from the System
Development Foundation. Requests for reprints should be sent to
the Institute for Cognitive Science C-015; University of California,
San Diego; La Jolla, California, 92093, USA.


1. UNIX is a trademark of Bell Laboratories. The comments in
this paper refer to the 4.1 BSD version developed at the University
of California, Berkeley.

To be published in: Proceedings of the CHI 1983 Conference
on Human Factors in Computer Systems. Boston; December,
1983.


The need for providing on-line system documentation
has become widely accepted (Girill and Luk, 1983; Glushko
and Bianchi, 1982). However, with the increasing volume of
documentation available on line, users have difficulty
finding relevant information at the appropriate level of
detail. The on-line documentation facilities outlined in this
paper are the first step toward our overall goal of providing
a comprehensive, structured system manual that would
implement a version of the 'hypertext' concept that has
been discussed by Nelson (1974). The basic idea is that of
structuring information into small, richly interconnected
'chunks'. Users could be allowed to access the database
containing these chunks by a variety of methods, and could
browse through the pieces of information or expand on an
arbitrary item at will.

The  UNIX  Environment 

UNIX is a very large and powerful operating system
comprising over 700 commands. These various programs
were built up in a piecewise fashion by a myriad of
different programmers, and the available features are a
constantly expanding set with little top-down organization
imposed on them. The exploitation of the idea of modular
design of software tools proves to be a very positive feature
when viewed from a system perspective. It offers extremely
flexible tools, each of which can be used in varied
environments, in varied combinations, and for disparate purposes.
From the point of view of the user, however, this modularity
can have disadvantages, especially since it is not directly
accessible or explicitly represented. Users have difficulty
understanding how the different modules are
interconnected and in determining the most effective way to
perform a task. Although modularity leads to cutting across
program boundaries, standard documentation is still
structured around the level of the program. Thus, beginning
users with very basic, simple tasks to perform find it
difficult to learn how to communicate their requests, and
even more difficult to come to a conceptual understanding
of the system. Indeed, even experienced programmers are
often unaware of the full range of software tools that
people have developed for the system, as some of our data
explicitly indicate.

The problem of inadequate documentation for the
UNIX system has been voiced repeatedly by many people,
both at the level of specific program documentation and at
the more general level of user support documentation.
Existing documentation is not sensitive to the level of
expertise of the user. There is a need for a variety of user
support, from the provision of mental models to assist in
conceptual understanding to tutorials and on-line help
facilities.

One proposal for improving both the quantity and
quality of system documentation is to transfer the bulk of
this task from programmers and specialist documentation
writers to the users themselves. Such an approach has a
number of interesting consequences, some of which are
explored in Draper (190). Our philosophy is that the
design of documentation is an iterative process of collecting
information on user needs, designing an integrated set of
tools to support these needs, and evaluating the
effectiveness of these tools. We believe that a structured
approach to supporting the needs of the user and the design
of software tools is necessary to ensure integration of user
support tools and appropriate user feedback.

User  Needs 

We have examined how people use the help facilities
existing on our UNIX system by monitoring the use of the
on-line manual, and by soliciting feedback from users
concerning the information they were trying to obtain. Our
analyses have led to the identification of three facilities that
users may require:

(1)  quick  reference; 

(2) task specific help;

(3)  full  explanation. 


The need for a quick reference facility was identified
from the on-line comments furnished by users as they
sought information in the manual. Users said that they had
received far more information than they actually wanted,
that they were simply trying to confirm the name of a
command and its effect, or that they merely wanted to check on
the flags or options which the command used. In other
words, users need to be able to verify the name of a
program, or check on flags, options and syntax, without having
to scan through extraneous material.

Task specific or functional needs were identified by
requests for help for groupings of commands that perform
similar operations or different operations on similar objects.
There is a need for help that is sensitive to context or for
some kind of database retrieval facility that supports
multiple perspectives on the same data.

It is clear that there is often a need for a full
description and explanation of the operation of a program. In
some cases this might simply require display of the full
manual entry on a command. However, other cases might
require a relatively lengthy tutorial, for example one
explaining the operation of an editor. Based on these
observations, we would like to propose a set of potential
solutions to these problems.

Quick  Reference

Many users who need on-line assistance are already
familiar with the commands they wish to use and need only
a verification of a command name or a reminder of proper
syntax and possible options. Currently they must search
through a lengthy manual entry to find this information.
Since these users are not interested in a complete
explanation of commands, they would be better served by an
abbreviated reference manual. To meet this need, we recommend
a 'Quick Reference' on-line manual that contains only the
correct syntax of the command, a list of possible options,
and a minimum of explanation. Each quick reference
manual entry should also be capable of directing users to
other sources of information (i.e., regular manual entry,
tutorials, etc.) should they need a more complete
explanation.

We have constructed such a quick reference facility
for the printing commands available on our system. As an
aid to the evaluation of this facility, transcripts of use are
being recorded for data analysis. We have been exploring a
means of enabling users to provide feedback on the utility
of the system that is highly specific yet requires minimal
user effort. When users have finished viewing the
information on a printing command, they can indicate that that
specific use of the quick reference entry was helpful by
using an upper case menu selection to continue; the lower
case selection achieves the same effect without registering a
success. In our preliminary data upper case selections
outnumber lower case selections by a significant margin.
Furthermore, the data rule out simple case perseveration by
individual users, contradicting the expectations of several
colleagues. More detailed assessment of the facility must await
further data.
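
The feedback scheme described above can be illustrated with a short sketch. It is only an assumption about how such transcripts might be tallied: the event format (a user name paired with the menu key typed) and the function name `tally_feedback` are hypothetical and do not come from the facility itself. The sketch counts upper case and lower case selections and lists the users who used both cases, which is one simple way to check for case perseveration.

```python
from collections import Counter, defaultdict


def tally_feedback(events):
    """Tally menu selections from (user, key) pairs.

    An upper case key is counted as 'helpful'; a lower case key
    continues without registering a success ('neutral').
    """
    overall = Counter()
    per_user = defaultdict(Counter)
    for user, key in events:
        kind = "helpful" if key.isupper() else "neutral"
        overall[kind] += 1
        per_user[user][kind] += 1
    # A user who typed both cases at least once is not simply
    # repeating one case out of habit.
    mixed = sorted(
        user for user, counts in per_user.items()
        if counts["helpful"] and counts["neutral"]
    )
    return overall, mixed


# Hypothetical transcript data, for illustration only.
events = [("ann", "C"), ("ann", "c"), ("bob", "C"), ("bob", "C"), ("cy", "c")]
overall, mixed = tally_feedback(events)
```

With this made-up transcript, three selections register as helpful and two do not, and only one user switches between cases.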

Task-Specific  Help:  A  Command  Line  Database

The concepts that new users bring to bear on a task
domain often do not map well onto the concepts
incorporated into the system. Thus the user may be able to
verbalize the task to be performed but be unable to formulate
the corresponding command. A task-specific help facility
should address both the need to make explicit the
conceptualizations of tasks that are embodied in the system and the
need to match the user's descriptions of the task to
appropriate commands.

In UNIX, a typical command line may utilize several
different programs. It is often difficult for the user to know
the various ways programs can interact. We believe this
makes the program level inadequate as the sole level for
documentation. Since the meaningful units in UNIX are
combinations of modules that form command lines, an
important level at which to document the system is that of the
command line. This would supplement documentation at
the program level.

Our monitoring of the use of existing help facilities
revealed groupings of commands which reflected a
functional organization. Users were seeking help for a particular
task by looking for commands which were related to each
other in meaningful ways. Having identified the need for a
semantic organization in the manual database, we examined
the functional units or command lines involved in one
particular task domain, that of printing. There are over thirty
programs immediately related to printing documents on our
system, apart from the preprocessors and postprocessors
that are used with these programs. Most of these programs
accept at least half a dozen options or flags, depending on
the particular task to be performed. Moreover, they can
combine with each other in a variety of ways, thus greatly
increasing the number of actual functional units, the
command lines. Some meaningful dimensions along which these
units are being organized include the type of terminal used,
whether hard or soft copy printing is required, or whether
formatting, paging or line numbering is required. This large
number of dimensions makes the task of organizing a
command line database difficult, yet such a facility could be
extremely useful to both experienced and inexperienced
users.

We  recommend  that  each  entry  in  the  database  con¬ 
tain  a  command  line  that  performs  a  specific  task  (printing 
or  searching,  for  example),  together  with  an  explanation  of 
that  task.  When  several  command  lines  can  do  a  given  task,
the  several  synonyms  could  be  included  within  a  single 
entry,  and  the  subtle  differences  could  be  explained.  In 
addition  to  explanations  of  the  command  lines  as  a  whole, 
each  entry  could  provide  access  to  existing  documentation 
on  each  individual  program  and  its  numerous  flags.  Pointers 
to  relevant  existing  tutorials  could  also  be  included.  Finally, 
a  mechanism  for  browsing  among  command  line  entries  that 
perform  related  tasks  could  be  provided. 

The  database  could  be  constructed  by  first  analyzing 
user  needs  to  determine  what  tasks  need  to  be  described, 
then  consulting  system  experts  to  determine  the  appropriate 
command  line  components.  Several  access  mechanisms  can 
be  provided  to  present  the  user  with  a  functionally  organ¬ 
ized  view  of  the  command  line  database:  for  example, 
specially-designed  entry  through  English  terms,  entries 
through  specification  of  values  for  attributes,  entry  through 
approximate  pattern-matching,  or  keyword  search.  We 
recommend  that  more  than  one,  if  not  all,  of  these  access 
mechanisms  be  made  available. 

The mapping between command line entries and users'
descriptions should be many-to-many. For example, the
word 'delete' could access entries for deleting a file, a
directory, a command line, or a character in a file; conversely,
the command line for deleting a file could be accessed by
the words 'delete', 'remove', 'rm', or 'unlink'. The access
mechanisms should be available not just to provide an initial
set of entries in the command line database, but also to pass
from one such set to another set containing entries similar
to the previous ones but differing in some user-determined
respects.
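
A minimal sketch of such a many-to-many mapping follows. The class name `CommandLineIndex`, the entry identifiers, and the placeholder arguments FILE and DIR are assumptions made for illustration; only the access words and the idea that one word reaches several entries (and one entry is reached by several words) are taken from the discussion above.

```python
from collections import defaultdict


class CommandLineIndex:
    """Keyword access to command line entries, many-to-many."""

    def __init__(self):
        self.entries = {}                 # entry id -> (command line, description)
        self.by_term = defaultdict(set)   # access word -> entry ids
        self.terms_of = defaultdict(set)  # entry id -> access words

    def add(self, entry_id, command_line, description, terms):
        self.entries[entry_id] = (command_line, description)
        for term in terms:
            self.by_term[term.lower()].add(entry_id)
            self.terms_of[entry_id].add(term.lower())

    def lookup(self, term):
        """Return the ids of all entries reachable from one access word."""
        return sorted(self.by_term.get(term.lower(), set()))


# Illustrative entries; FILE and DIR are placeholders, not real arguments.
index = CommandLineIndex()
index.add("delete-file", "rm FILE", "delete a file",
          ["delete", "remove", "rm", "unlink"])
index.add("delete-directory", "rmdir DIR", "delete an empty directory",
          ["delete", "remove", "rmdir"])

by_delete = index.lookup("delete")   # one word, several entries
by_unlink = index.lookup("unlink")   # a different word, same file entry
```

Keeping both directions of the mapping is a design choice: the reverse table (`terms_of`) is what would let a browser move from one entry to neighbouring entries that share some, but not all, of its access words.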

Two  mechanisms  for  generating  database  entries  can 
be  employed.  Within  task  domains  that  have  been  character¬ 
ized  by  conceptual  dimensions,  command  lines  possessing 
the  legal  combination  of  attributes  can  be  specifically 
designed.  Secondly,  users'  command  lines  can  be  automati¬ 
cally  collected  and  edited  for  inclusion  in  the  database,  and 
users  can  be  asked  to  give  a  description  of  the  task  they 
hope  to  perform.  The  fact  that  these  examples  come  from 
users  themselves  could  greatly  increase  the  utility  of  the 
database.  This  data  can  also  be  used  to  compile  statistics  on 
the  most  commonly  used  commands,  enabling  default 
assignments  for  task  dimension,  for  use  when  only  a  few 
attributes  are  specified.  This  would  also  provide  continued 
evaluation  of  the  facility. 

Full  Explanation 

The command line database does not eliminate the
need for a full explanation of the capabilities of an
individual program. Existing documentation at the program level
should be revised in order to improve the metalanguage for
describing syntax, to standardize the structure, to increase
consistency of usage of terms, and to improve readability.
The command lines in the database that use a given
program could be referred to by the documentation on that
program, thus greatly increasing the number of examples in
the documentation.

Another  need  exists  for  the  documentation  of  system 
features  that  cross  program  boundaries.  The  task  dimen¬ 
sions  developed  for  the  command  line  database  offer  a 
framework  for  organizing  such  documentation.  Users  could 
have  access  to  an  explanation  of  any  dimensions  or  attri¬ 
butes  chosen  when  they  are  performing  command  line  data¬ 
base  retrieval.  Full  explanation  of  these  dimensions  could 
do  much  to  further  the  user's  adoption  of  appropriate  task 
conceptualizations  for  the  system.  This  relates  to  the  more 
general  issue  of  how  to  introduce  new  users  to  the  system
and  how  effective  these  dimensions  are  in  helping  the  new 
user  form  an  appropriate  mental  model  of  system  features. 

Integrated  Structured  Manual 

The proposals for projects outlined above draw upon
information about UNIX programs that is highly interrelated
and intimately bound to the information traditionally
contained within a user's manual. As these individual projects
reach completion, it should then be possible to draw
together the information in the database that each facility
uses and develop an integrated database to be accessed by
routines that perform the various help functions. This
database should contain synopses of programs (as used by the
quick reference facility), many examples of complete
command lines together with a description of their effects,
complete descriptions of programs (as used by a conventional
on-line manual), and tutorials on aspects of the system that
often cut across program boundaries. The high degree of
interconnection between items of information in this
database could lend itself to representation as a network: we
refer to this network as the 'structured manual.' This
manual will be one realization of Nelson's (1974) vision of
'hypertext.'

The structured manual would contain a wealth of
information about the UNIX system. This information can
be broken down into rather small units, such as one-line
descriptions of program flags, individual sample command
lines, and individual short paragraphs containing tutorial
text, and can be interconnected by network links expressing
mutual relationships. Meta-information can also be
contained in the network: information about which tutorial
paragraphs are prerequisites of a given paragraph and about
which information is appropriate only for novices or of
interest only to experts. The network links could be used
by a browser that would enable easy passage between
related pieces of information, so that good support would
be provided for expansion of knowledge about the system.
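
One way to picture the structured manual is as a small graph of text units carrying the meta-information just described. The sketch below is an assumption about a possible representation, not a description of an existing implementation: the `Node` fields, the unit names, and the two helper functions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One small unit of the manual (flag description, sample line, paragraph)."""
    text: str
    kind: str
    audience: str = "all"                              # "all", "novice" or "expert"
    prerequisites: list = field(default_factory=list)  # ids to read first
    links: list = field(default_factory=list)          # ids of related units


def reading_order(manual, node_id, visited=None, order=None):
    """List a unit's prerequisites (recursively) before the unit itself."""
    if visited is None:
        visited, order = set(), []
    if node_id in visited:          # also guards against circular prerequisites
        return order
    visited.add(node_id)
    for pre in manual[node_id].prerequisites:
        reading_order(manual, pre, visited, order)
    order.append(node_id)
    return order


def related(manual, node_id, reader="novice"):
    """Follow network links, hiding units meant for the other audience."""
    return [n for n in manual[node_id].links
            if manual[n].audience in ("all", reader)]


# Hypothetical contents, for illustration only.
manual = {
    "files": Node("What a file is.", "tutorial", audience="novice"),
    "pipes": Node("How a pipe connects two programs.", "tutorial",
                  prerequisites=["files"], links=["sample-line"]),
    "sample-line": Node("A sample command line that uses a pipe.",
                        "command line", prerequisites=["pipes"],
                        links=["pipes", "internals"]),
    "internals": Node("Implementation notes.", "tutorial", audience="expert"),
}

order = reading_order(manual, "sample-line")
for_novice = related(manual, "sample-line", "novice")
for_expert = related(manual, "sample-line", "expert")
```

A browser built on such a structure would need only these two operations to offer "what should I read first" and "what is related to this" for a given level of expertise.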

We believe that these proposals are a positive step
towards providing user documentation that is both sensitive
to the level of expertise of the user and capitalizes on the
flexibility of modular systems.

Abstract

This paper reports on a study investigating the
strengths and weaknesses of questionnaires as software
evaluation tools. Two major influences on the usefulness of
questionnaire-based evaluation responses are examined: the
administration of the questionnaire, and the background
and experience of the respondent. Two questionnaires were
administered to a large number of students in an
introductory programming class. The questionnaires were also given
to a group of more experienced users (including course
proctors). Respondents were asked to evaluate the text
editor used in the class along a number of dimensions;
evaluation responses were solicited using a number of different
question types. Another group of students received the
questionnaire individually, with part of it presented on the
computer; a third group also evaluated an enhanced version
of the editor in followup sessions.

Introduction 

A common sentiment expressed by those trying to free
themselves of the bad aspects of user-indifferent,
programmer-dominated software design is: 'Ask the user's
opinion of the interface'. The editor questionnaire project
described here attempts to explore the usefulness of one
obvious interpretation of this maxim - asking for the user's
evaluation of a system, in this case a screen-oriented editor,
through the use of questionnaires.

Questionnaires and Software Evaluation Methodology.
We take the view that software technology should be, and
probably is, moving towards greater concern for the user in
the design of the interface: the logical endpoint of this
process is the adoption of an objective evaluation of the user
interface to a program as standard 'good practice', in much
the same way that well-commented, well-structured code is
currently a mark of well-designed software. Realization of
this goal will require the development of software
evaluation techniques that are 1) inexpensive, 2) easy for software
teams to apply, and 3) effective at identifying the good and
bad aspects of an interface. While a questionnaire
methodology seems promising, as it clearly meets the requirements
of inexpensiveness and ease of application, the question of
effectiveness remains unanswered. This is the major focus
of this study.

A priori there seem to be some grounds for both
optimism and pessimism regarding the effectiveness of
questionnaire-based evaluations. On the one hand, users
are likely to encounter problems that the software designer
has not foreseen and any articulation of these problem areas
may be adequate to allow the designer to track down and
correct the flaws. On the other hand, 'naive' users (those
with little or no relevant computer experience) may be
unable to mount a useful critical response, particularly if
such responses must be based either on comparisons with
other systems or on the user's own model (built up over
time and use) of what interfaces should be like. Experts
however can be hard to come by especially for new systems,
or expensive. This study is designed to investigate these
uncertainties about the usefulness of 'asking the user'
through the medium of questionnaires.

Overview  of  the  Study 

The major research questions concern:

1) the effects of different types of questions on the quality
of the responses;

2) the effects of user experience on the quality of the
responses; and

3) the effects of the method of administering the
questionnaire on the quality of the responses.

The main test population for the questionnaire study
was the introductory programming course in UCSD Pascal
at the University of California, San Diego. The course is
taken by several hundred students each quarter; for most of
them the class is their first significant exposure to computing
and interactive editors. The class is self-paced and students
are assisted by 20-30 undergraduate proctors who have
already completed some instruction in Pascal programming.
Thus we had access to a large sample of relatively naive
users and, in the same class, a smaller sample of relatively
experienced users with whom to compare answers.
Furthermore, the UCSD Pascal system is widely used on the
campus, affording us access to a larger sample of
experienced users. Finally, the use of students in an instructional
setting as evaluators lends greater generality to the results
since this makes it more of a field study than a laboratory
one.

We chose to examine the screen editor because editors
as a group are probably the most extensively used
interactive programs of all, because the editor must be used in
order to write any program in Pascal, and because editors
are heavily interactive and, therefore, contain many of the
user-interface qualities we wish to be able to evaluate.

We did not pick the Pascal editor (or the UCSD Pascal
system) because it has an exceptional number of flaws, bugs
or traps in it. It does not appear to be a bad editor. We
picked it for reasons of convenience and familiarity.

We administered a first questionnaire in the first week
of the course and a second questionnaire in the seventh
week of the course. At about the same time, 20-30 students
were given the second questionnaire on an individual basis
with part of it implemented on the computer; another group
of 20-30 users was exposed to an enhanced version of the
Pascal editor to compare their opinions of new editor
features before and after exposure. In addition, both
questionnaires were given to the course proctors at the same
time as they were given to the students in the class.

Method 

Categories of user experience. Subjects were classified
along two independent experience dimensions: experience
with the Pascal editor itself, and experience with other
editors. The first dimension was provided by the distinction
between the students in this course and the proctors
(undergraduate teaching assistants). We obtained the latter
information from questionnaire one, administered in the first
week of class to all students and proctors.

Types of question used. Questionnaire two was designed to
elicit information about the user's knowledge of the editor
and the weaknesses of its interface. We used three types of
questions: checklists that list all the editor commands and
ask (on 3 point scales) about the respondent's knowledge,
use of and problems with each command; specific questions
about problems with existing features (both commands and
other features) and about proposed modifications; and
general (open-ended) questions about complaints and desired
changes to the existing editor (e.g. 'are there any (other)
commands or traps you would like to see changed in an
improved version of the editor?'). The checklists asked on
3-point scales, for each command, whether the user knew it,
avoided it, whether it was dangerous, awkward to use, hard
to type, had a difficult syntax, or whether it was hard to
predict its outcome.

Identifying problem areas. There are two kinds of
concerns that a software evaluator may have: to identify
problem areas that may not even have been suspected
before, and to get more information on existing hypotheses
about problem areas. The first concern is addressed by the
checklists and the general questions; both yield lists of
features perceived as having problems. The respondent's
answers may be examined for internal consistency by
comparing the lists from each type of question. An important
measure of the value of this approach to identifying
problem areas is inter-subject agreement (the proportion of
subjects identifying each given problem area).

The second concern is addressed by checklists and
specific questions about existing features of the editor.
Again the answers may be examined for internal consistency
by comparing the responses to these specific questions
against the checklist responses. The specific questions can
also be compared against the general ones for their ability
to address features of the editor that cannot be listed on the
checklists (which are generated simply from an exhaustive
list of commands) - for instance a non-command feature
such as the global direction parameter. (In the Pascal editor
this parameter governs the direction of operation of
searching and paging commands.) Specific questions also allow the
questionnaire user to target particular topics for open-ended
discussion by respondents. Again a measure of the
usefulness of responses here is agreement between subjects on
identifying problems and proposing solutions.

Asking users about proposed solutions. The question
types used here are the specific ones, concerning possible
changes to the existing editor and asking for rating
responses (e.g. 'How useful would this change be for you?')
and opinions (e.g. 'Would you like a delete-word
command?'), and the general questions dealing with the users'
desires for improvements in the editor. The main check on
the effectiveness of these questions at eliciting useful
responses is provided by the followup study in which 33
respondents were given exposure to an enhanced version of
the editor incorporating the proposed changes, and again
asked to rate the desirability of the features. A comparison
of pre- and post-exposure responses allows us to judge the
reliability of subjects' opinions about proposed changes.

Administering the questionnaire. There is a basic problem
in asking people to evaluate an interactive computer system
on pencil and paper - that using the system is something
you do, not primarily something you talk about. Thus
people may have trouble recalling their editing experiences in
response to verbal references to it - for instance they are
probably not practised in thinking of the editor in terms of
a set of named commands and features but only in terms of
making responses to editing problems. There is a further
related problem - that without a prior set towards thinking
of the system with a view to evaluating it, people may not
categorize and remember their experiences from this point
of view. Thus when you later ask them to do so, they may
be unable to recall much of relevance.

We ordered the questions with these considerations in
mind, arranging a chance for the user to recall not only the
commands (the checklists give an exhaustive list) but also
their use. We also had a section of questions testing editor
knowledge. These questions required two things of the
respondent: producing an editor command sequence that
would achieve a given effect (the editor-completion task),
and predicting the effect of a given command sequence on a
piece of text. The relevance of these questions to the
results reported here is only their contributions to
reminding the subjects of the experience of using the editor.

To examine the effects on responses of different
amounts of support for recall, we used two different
conditions for administration of the questionnaire. In the first
('cold') condition the paper and pencil version of the
questionnaire was offered to the entire class. Students filled in
the questionnaire in their own time, handing it in several
days after picking it up. In the second ('hot') condition the
editor-completion task, which was done on paper in the
'cold' condition, was done on the computer. All other
sections of the questionnaire were presented and completed in
exactly the same order in both conditions. The hot
condition was given to 26 students who signed up for a special
session in which they completed the questionnaire and did
the computer task. These students also received a token
sum ($5.00) for participating in the sessions on their own
time.

In the cold condition the editor-completion task was
optional. We were led to this by finding in the pilot study
that some respondents either failed to fill it out at all, filled
it out only partially, or complained vigorously about it.
Thus the cold condition was subdivided into 'cold' and
'ultra-cold', the latter referring to those who chose not to
complete the optional section. Out of a class of about 375
students, we eventually recruited 25, 23, and 13 for the hot,
cold, and ultra-cold conditions respectively, besides 11
proctors (5 hot, 6 cold).

Results 

The checklist questions enabled us first to get a
measure of how widely known each command was (see table 1).
Besides its intrinsic usefulness this also allowed us to
identify which editor commands were used by so few users as to
prevent any other significant information from being gained
about them. Of the 11 editor commands we judged 3 ('zap',
'set', and 'find') to be in this category, with 69%, 36% and
39% of users respectively stating their ignorance of them.

With the remainder we chose a criterion of
problematicity for a command based on the other inquiries in the
checklist. This was that (on the 1 to 3 rating scales used
throughout the checklists) either the command was avoided
or dangerous at least some of the time, or was awkward to
use, hard to type, with difficult syntax, or with an outcome
difficult to predict all of the time. This criterion was used
to decide whether each user flagged the command as
problematic, and a rank ordering of commands was made based
on the proportion of users flagging it as problematic. These
proportions were adjusted for the number of users who
claimed to know the command at all (omitting altogether
the 'zap', 'set', and 'find' commands which had too few
knowledgeable users to be useful). The scores ranged from
89% (replace) to 23% (adjust) (see table 2). The set
consisting of the top 3 commands selected by this criterion remained
robust across the various experience groups and might be
taken as the set in need of attention.
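
The criterion can be stated compactly in code. The sketch below rests on two assumptions that the text does not spell out: that the 3-point scale runs 1 = never, 2 = some of the time, 3 = all of the time, and that "adjusted for the number of users who claimed to know the command" means that only those users form the denominator. The field names and the helper `rating` are hypothetical.

```python
SOMETIMES, ALWAYS = 2, 3   # assumed scale: 1 = never, 2 = sometimes, 3 = always

SEVERE = ("avoided", "dangerous")   # count if true at least some of the time
MILD = ("awkward", "hard_to_type", "difficult_syntax", "unpredictable")  # only if always


def rating(known=True, **scores):
    """Build one user's checklist row for one command; unset items default to 1."""
    row = {name: 1 for name in SEVERE + MILD}
    row.update(scores)
    row["known"] = known
    return row


def is_flagged(row):
    """Apply the problematicity criterion to one user's ratings."""
    return (any(row[name] >= SOMETIMES for name in SEVERE)
            or any(row[name] == ALWAYS for name in MILD))


def problematicity(rows):
    """Percent of users who know the command and flag it; None if nobody knows it."""
    knowing = [row for row in rows if row["known"]]
    if not knowing:
        return None
    return 100.0 * sum(is_flagged(row) for row in knowing) / len(knowing)


# Made-up responses for a single command, for illustration only.
rows = [
    rating(known=False),       # does not know the command: excluded
    rating(avoided=2),         # avoided some of the time: flagged
    rating(awkward=3),         # awkward all of the time: flagged
    rating(awkward=2),         # awkward only sometimes: not flagged
]
score = problematicity(rows)
```

Ranking the commands then amounts to sorting them by this score within each subject group.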

Table 1

                          subject groups
command       H    C    U    E    N    S    P  TOTAL
ZAP           ?   49   83   33   93   83   18     69
SET          67   3?   70   40   83   73    0     36
FIND          ?   21   34   33   39   36    0     38
REPLACE      ?7   21   34   30   33   31    0     25
COPY         10    0    8    3   18    8    0     13
ADJUST        7    3    0   10    6    6    0      8
JUMP         13    3    8   13    6   10    0      8
EXCHANGE     13    0    8    5    ?   10    0      7
QUIT          3    0    0    5    0    2    0      1
DELETE        0    0    0    0    0    0    0      0
INSERT        0    0    0    0    0    ?    0      0

H=Hot  C=Cold  U=Ultra-Cold  E=Other Editor Experience
N=No Other Editor Experience  S=Student  P=Proctor
(? = digit illegible in the source)

Table 1: Percentage of subjects who don't know each command

Table 2

                          subject groups
command       H    C    U    E    N    S    P  TOTAL
REPLACE      84   87  100   86   91   85   91     89
DELETE       80   83   62   90   71   ?5   91     ?3
COPY         81   59   67   58   86   86   27     77
QUIT         59   59   38   69   53   47   32     55
EXCHANGE     50   3?    ?   42   53   42   55     39
INSERT       27   41    ?   30   33   32   18     ?2
JUMP         19   32    ?   12   19    ?    0     26
ADJUST       11   36    ?   17   19   24    9     24

H=Hot  C=Cold  U=Ultra-Cold  E=Other Editor Experience
N=No Other Editor Experience  S=Student  P=Proctor
(? = digit illegible in the source)
Table 2: Percentage of subjects indicating
a problem with each command

One of the specific questions, concerning the direction
parameter, could be analyzed in a similar manner despite its
different phrasing, in order to provide a way of comparing
its effectiveness with the checklist type of question. Users
were asked (among other things) how confusing they found
it (free response rather than a multi-choice question), and
their answers were coded on a 3 point scale. Those coded
as 'very confusing' were counted as indicating that the
feature was problematic. Similarly those who said they still
had not learned about it were coded as not knowing the
command.

We can divide the subjects along three independent
dimensions. The first is by method of administration: 'hot',
'cold', and 'ultra-cold' with 30, 29, and 13 respectively to
give a total of 72. The second is by amount of experience of
this editor (in our case this coincides with the distinction
between proctor and student). The third dimension applies
to a subset of 37 for whom we had information on whether
they had experience of any other editor - 20 of these did,
and 17 did not. We looked at the ordering of questions in
terms of how widely they were known and of problematicity
for each sub-group separately.

Excluding 'zap', 'set', and 'find', the 3 most
problematic commands were 'replace', 'copy' and 'delete' in
every group on the 'hot' to 'ultra-cold' dimension. In the
other-editor experience group, 'quit' came third with
'delete' fourth ('quit' is the fourth most problematic in the
population as a whole). If the direction parameter
problematicity is compared to other commands it comes last in
every group. This does not simply indicate too insensitive a
measure since the actual numbers (percent of users finding
it problematic) vary from group to group (e.g. 4.7% to 13%
for hot and cold) roughly according to the levels of the least
problematic command for that group (11% and 32%).
Although this study gives no information about how to scale
this measure against the other one, it suggests that the two
types of question have similar properties.

There were some differences between the groups in
the range (discriminative power) of their group judgements
of problematicity. The hot group ranged over 73 percentage
points, while the cold ranged over 55, and the ultra-cold
over 77. The group having more experience of the Pascal
editor (the proctors) ranged over 91, while those with less
ranged over 62. The group with experience of another
editor ranged over 78 points, while those without ranged over
72.

Overall the 'general' type of questions produced 118
remarks from 33 individuals out of a total population of 72,
and these represented 66 distinct novel comments (i.e. some
individuals made essentially the same point as each other).
Two measures seem of interest: the percentage of subjects
who provided free responses to these questions, and the
number of intelligible, relevant and non-redundant
responses gleaned (since some responses dealt with matters
raised in other questions). The latter measure represents
the number of distinct items brought to the designer's
attention as a result of the general questions. For the group
as a whole, these measures were thus 70% and 66. In comparing
subgroups the number of useful responses per
member of the group is also relevant. The method of
administration produced a substantial difference. The
ultra-cold group were the least helpful (only 38% provided
any response), presumably reflecting a carry-over from
reluctance to do the optional editor task to a reluctance to
provide free responses, and yielding only 0.46 useful
responses per person. On the other hand the cold group had
a response rate of 86% giving 1.86 useful responses per person,
while the hot group with 73% response rate gave 1.36
useful responses per person. Both kinds of experience had
a big effect on the number of useful responses: the group
with more experience with this editor (proctors) gave 2.18
useful responses per person, while the group with less gave
1.26; and the group with experience of other editors gave 1.93
while those without gave 0.76 responses per person.

We looked at the responses of a subset of the subjects
to questions about 3 features of an enhanced version of the
editor before and after they had experience with it, to
examine the usefulness of asking respondents about whether
they would like some enhancement. The correlation of a
group's responses before and after actual experience provided
the measure of this. The correlations for all groups
were low and with apparently no meaningful variation, the
correlation for the whole population being 0.33. Informally
it seems that there is more consensus between groups on
post-experience than on pre-experience scores, and that within a
given group there is a trend to predicting about the same
usefulness for all 3 features, while the post-experience
scores vary much more.
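
The before/after comparison can be illustrated with a small sketch.
The code below computes a Pearson correlation between ratings given
before and after experience. The text does not say which correlation
coefficient was used, so Pearson is an assumption here; the function
name and the sample ratings are hypothetical and are not data from
the study.

```python
from math import sqrt

def pearson(before, after):
    """Pearson correlation between two equal-length lists of ratings."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(before)
    mean_b = sum(before) / n
    mean_a = sum(after) / n
    cov = sum((b - mean_b) * (a - mean_a) for b, a in zip(before, after))
    var_b = sum((b - mean_b) ** 2 for b in before)
    var_a = sum((a - mean_a) ** 2 for a in after)
    if var_b == 0 or var_a == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_b * var_a)

# Hypothetical ratings (1 = useless, 5 = very useful), one per respondent.
predicted = [4, 4, 3, 4, 5, 3]   # before using the enhanced editor
reported = [2, 5, 3, 1, 4, 4]    # after using it
print(round(pearson(predicted, reported), 2))
```

A value near 1 would mean that predictions made before experience
track later satisfaction; a low value means they are a poor guide.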

Our questionnaire afforded two opportunities for
internal consistency checks. In one we compared subjects'
answers about 'zap' on the checklists to their answers to the
specific question about its usefulness. 2fi%
responded inconsistently that they did not know the command
but gave an opinion on its usefulness. (Both hot and
cold groups separately showed essentially the same consistency.)
In another check we looked at whether, if a subject
identified a command as problematic in answer to a
general question, they had also done so on the checklist.
The group as a whole was 83% consistent by this measure.
Only  the  small  proportion  of  subjects  who  gave  a  response 
to  a  general  question  that  could  be  checked  in  this  way  ) 

Conclusions

Our results suggest that asking users about the value of
some proposed change without giving them experience of it
is an essentially useless guide to their satisfaction with it in
practice. On the other hand our results suggest that
checklist-style questions about specific existing features of a
system do yield findings that are robust across methods of
questionnaire administration and across the amount of user
experience, and which are reasonably consistent within subjects.
As far as they go, the results support the idea that
comparable non-checklist specific questions have the same
properties. The robust results from specific questions gave
an ordering of commands in terms of how little known they
were by users, which is important since an unknown command
is of no practical use in an interface. They also
yielded an ordering of trouble spots in the existing system in
terms of a measure of user dissatisfaction, together with an
estimate of its magnitude that could be used as a guide for a
software engineer organizing further work on the system's
interface. The non-checklist specific questions addressed to
existing editor features showed results similar to the checklists.

The internal consistency checks we were able to do
showed that while consistency was high enough not to
vitiate the usefulness of the results, it was certainly not
100% by any measure. This indicates the need to build consistency
checks into any questionnaire instrument in order
to keep track of the quality of the responses. This can be
conveniently done by using different question formats referring
to the same item, so that while some combinations of
response are inconsistent (which yields the consistency
check) others give new information about the item.
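
A minimal sketch of such a check follows, assuming each respondent's
answers are stored as a record holding a checklist entry and an
optional usefulness rating for the same command. The field names and
the sample records are hypothetical.

```python
def inconsistent(record):
    """True if the respondent rated a command they said they did not know."""
    return (not record["knows_command"]) and record["usefulness"] is not None

def inconsistency_rate(records):
    """Fraction of respondents whose two answers about one command conflict."""
    if not records:
        raise ValueError("no records to check")
    return sum(1 for r in records if inconsistent(r)) / len(records)

# Hypothetical answers about one command from four respondents.
records = [
    {"knows_command": True, "usefulness": 4},      # consistent, informative
    {"knows_command": True, "usefulness": None},   # consistent, no opinion
    {"knows_command": False, "usefulness": None},  # consistent
    {"knows_command": False, "usefulness": 3},     # inconsistent
]
print(inconsistency_rate(records))
```

Only one of the four answer combinations is contradictory; the other
three each say something different about the command, which is why
pairing two question formats costs little.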

The different methods of administration, 'hot', 'cold'
and 'ultra-cold', did appear to make some difference to the
discriminability (range of responses) but did not greatly
alter the rank ordering of which commands were least
known nor of the troublesomeness of commands. Thus we
may tentatively conclude that method of administration may
affect the informativeness of results, in the sense that the
'colder' the conditions the less likely that any definite
information will emerge, but that it does not bias results in
the direction of criticizing different commands. It seems
therefore that questionnaire administration should perhaps
be arranged as much as possible toward 'hot' conditions,
where the user has as fresh experience of using the system
as possible.

The  effect  of  the  respondent’s  experience  appears  to 
be  similar  -  it  changes  the  discriminability  of  the  responses, 
but  not  the  overall  results.  In  our  study  this  was  true  of 
experience  with  the  editor  in  question,  and  of  experience  of 
other  editors. 

It seems then that the most successful question type is
the checklist, which will give a list of areas needing attention.
However it can only be applied to existing features of
the editor - to a designer's sins of commission. A general
evaluation tool must also address the problem of gathering
information on sins of omission (features that should be
added). General (i.e. open-ended) questions might do this.
This study did provide evidence that such questions do yield
useful responses, which are affected by the factors that
affect the specific questions. However a further study,
currently being planned, will be needed to discover how
effective such questions are: for instance, if one user complains
about a given topic, how many users would have complained
if asked directly in a specific question?

We began the present study with the view that there
might be an important difference in effectiveness between
different question types, conceiving of these differences in
terms of checklist versus specific versus general. The results
suggest that a more important distinction is between questions
addressed to existing features that the subjects therefore
have experience of, and questions about proposed new
features (no matter how specific) that they cannot have had
experience of. (This was evident in the results on the
predictive power of subjects, and also in the effect of
experience on the number of useful responses to general
questions.) This view suggests that while general questions
might be useful for eliciting criticisms of existing aspects of
the editor not mentioned in specific questions, they will
only be useful in identifying sins of omission if the subjects
have experience of other comparable systems (editors in this
case) as a basis for claiming the desirability of adding some
feature. This therefore predicts a particular effect of one
kind of experience on the usefulness of a particular kind of
general question (concerning sins of omission) - again a
further study will be needed to address this.

If the field of Human Factors in Computer Systems is to be a success it must develop design
principles that are useful, principles that apply across a wide range of technologies. In the
first part of this paper I discuss some of the properties that useful principles should have. While
I am at it, I warn of the dangers of the tar pits and the sirens of technology. We cannot avoid
these dangers entirely, for were we to do so, we would fail to cope with the real problems and
hazards of the field.

The second part of the paper is intended to illustrate the first part through the example of tradeoff
analysis. Any single design technique is apt to have its virtues along one dimension
compensated by deficiencies along another. Tradeoff analysis provides a quantitative method
of assessing tradeoff relations for two attributes x_i and x_j, by first determining the User
Satisfaction function for each, U(x), then showing how U(x_i) trades off against U(x_j). In general,
the User Satisfaction for a system is given by the weighted sum of the User Satisfaction values
for the attributes. The analysis is used to examine two different tradeoffs: information
versus time, and editor workspace versus menu size. Tradeoffs involving command languages
versus menu-based systems, choices of names, and handheld computers versus workstations are
examined briefly.


If we intend a science of human-computer interaction,
it is essential that we have principles from which to derive
the manner of the interaction between person and computer.
It is easy to devise experiments to test this idea or
that, to compare and contrast alternatives, or to evaluate
the quality of the latest technological offering. But we must
aspire to more than responsiveness to the current need.
The technology upon which the human-computer interface
is built changes rapidly relative to the time with which
psychological experimentation yields answers. If we do not
take care, today's answers apply only to yesterday's concerns.

Our design principles must be of sufficient generality
that they will outlast the technological demands of the
moment. But there is a second and most important criterion:
the principles must yield sufficiently precise answers
that they can actually be of use. Statements that proclaim
'Consider the user' are valid, but worthless. We need more
precise principles.

To be published in: Proceedings of the CHI 1983 Conference
on Human Factors in Computer Systems. Boston: December,
1983.

This new field - Human Factors in Computer Systems
- contains an unruly mixture of theoretical issues and practical
problems. Just as it is important that our theoretical
concerns have breadth, generality, and usability, so too is it
important that we understand the practical problems. We
are blessed with an exciting, rapidly developing technology
that is controlled through the time consuming and addictive
procedure called programming. There are traps for the
unwary: let me tell you about them.

Tar  Pits  and  Sirens  of  Technology 

As with most unexplored territories, dangers await: tar
pits and sirens. The former lie hidden in the path, ready to
trap the unwary. The latter stand openly, luring their prey
to destruction with bewitching sweetness. I see too many of
you trapped by one or the other.

with the Personnel and Training Research Programs of the Office of Na-

To program or not to program, that is the question.
Whether it is nobler to build systems or to remain pure,
arguing for abstract principles independent of the technology.
Build systems and you face the tar pits, writing programs
whose sole justification is to support the writing of
programs, eating up work-years, eating up resources, forever
making 'one last improvement.' When you finish, others
may look and nod, saying, 'yes, how clever.' But will anything
general be learned? Will the next technological leap
pass it by? Programming can be a pit that grabs the unwary
and holds them down. While in the pit they may struggle
and attract attention. Afterwards, there may be no visible
trace of their passing.

Alternatively, you may be seduced by the sirens of
technology. High resolution screens, color, three
dimensions, mice, eye-movement detectors, voice-in,
voice-out, touch-in, feelies-out; you name it, it will happen.
Superficial pleasure, but not necessarily any lasting result.
What general lessons will have been learned?

Damned if you do, damned if you don't. The pure in
heart will avoid the struggles, detour the tar pits, blind their
eyes to the sirens. 'We want general principles that are
independent of technology,' they proclaim. But then what
should they study? If the studies are truly independent of
the technology, they are apt to have little applicability.
How can you develop useful principles unless you understand
the powers and weaknesses of the technology, the
pressures and constraints of real design? Study a general
problem such as the choice of editor commands and someone
will develop a new philosophy of editing, or a new technological
device that makes the old work irrelevant. The
problem is that in avoiding the paths that contain the tar,
you may never reach any destination; in avoiding temptation,
you remain pure, but irrelevant. Life is tar pits and
sirens. Real design of real systems is filled with the messy
constraints of life: time pressures, budget limitations, a lack
of information, abilities, and energy. We are apt not to be
useful unless we understand these constraints and provide
tools that can succeed despite them, or better, that can help
alleviate them. Experimental psychology is not noted for its
contributions to life; the study of human-computer interface
should be.

Four Strategies for Providing Design Principles

What can we accomplish? One thing that is needed is
a way of introducing good design principles into the design
stage. How can we do this? Let me mention four ways.

1: Try to impress upon the designer the seriousness
   of the matter, to develop an awareness that users
   of systems have special needs that must be taken
   account of. The problem with this approach is
   that although such awareness is essential, good
   intentions do not necessarily lead to good design.
   Designers need to know what to do and how to
   do it.

2: Provide methods and guidelines. Quantitative
   methods are better than qualitative ones, but all
   are better than none at all. These methods and
   guidelines must be usable, they must be
   justifiable, they must have face validity. The
   designer is apt to be suspicious of many of our
   intentions. Moreover, unless we have worked
   out these guidelines with skill, they will be useless
   when confronted with the realities of design
   pressures. The rules must not only be justified
   by reasonable criteria, they must also appear to
   be reasonable: designers are not apt to care
   about the discussions in the theoretical journals.

3: Provide software tools for interface design. This
   can be a major positive force. Consider the
   problem of enforcing consistent procedures
   across all components of a system. With
   appropriate software tools, consistency can be
   enforced, if only because it will be easier to use
   the tools rather than to do without them. We
   can ensure reasonable design by building the
   principles into the tools.

4: Separate the interface design from other programming
   tasks. Make the interface a separate
   data module, communicating with programs and
   the operating system through a standardized
   communication channel and language. Interface
   design should be its own discipline, for it
   requires sophistication in both programming and
   human behavior. If we had the proper modularization,
   then the interface designer could modify
   the interface independently of the rest of the system.
   Similarly, many system changes would not
   require modification of the interface. The ideal
   method would be for software tools to be
   developed that can be used in the interface
   design by non-programmers. I imagine the day
   when I can self-tailor my own interface, carrying
   the specification around on a micro-chip embedded
   in a plastic card. Walk up to any computer
   terminal in the world, insert my card, and voila,
   it is my personalized terminal.

I recommend that we move toward all of these things. I
have ordered the list in terms of my preferences: last being
most favored; first being easiest and most likely today.
Each is difficult, each requires work.

There has been progress towards the development of
appropriate design methods. One approach is demonstrated
through the work of Card, Moran, and Newell (1983), who
developed formal quantitative methods of assessing a design.
Their techniques provide tools for the second of my suggested
procedures. Card, Moran, and Newell emphasize the
micro-processes of interaction with a computer - for example,
analysis at the level of keystrokes. At UCSD, we are
attempting to develop other procedures. In the end, the
field will need many methods and guidelines, each complementing
and supplementing the others. Let me now
describe briefly the approach that we are following, then
present one of our techniques - the tradeoff analysis - in
detail.

The UCSD User Centered System Design Project

At UCSD we have a large and active group attempting
to put our philosophy into practice. Our goal is to have
pure heart and clear mind, even while feet and loins are in
tar and temptation. Some of our initial activities are being
presented in this conference: Bannon, Cypher, Greenspan,
and Monty (1983); O'Malley, Smolensky, Bannon, Conway,
Graham, Sokolov, and Monty (1983); Root and Draper
(1983). Other examples have been published elsewhere or
are still undergoing final development (Norman, 1983a, b, c).

The principles that we follow take the form of statements
plus elaboration, the statements becoming slogans
that guide the research. The primary principle is summarised
by the slogan that has become the name of the project:
User Centered System Design. The slogan emphasizes our
belief that to develop design principles relevant to building
human-machine interfaces, it is necessary to focus on the
user of the system. This focus leads us naturally to a set of
topics and methods. It means we must observe how people
make use of computer systems. It brings to the fore the
study of the mental models that users form of the systems
with which they interact. This, in turn, leads to three
related concepts: the designer's view of the system - the
conceptual model; the image that the system presents to the
user - the system image; and third, the mental model the
user develops of the system, mediated to a large extent by
the system image. We believe that it is the task of the
designer to establish a conceptual image of the system that
is appropriate for the task and the class of users, then to
construct the system so that the system image guides the
user to acquire a mental model that matches the designer's
conceptual model.

The  Slogans 

There  are  five  major  slogans  that  guide  the  work: 

• There are no simple answers, only tradeoffs.

• There are no errors: all operations are iterations
  towards a goal.

• Low level protocols are critical.

• Activities are structured.

• Information retrieval dominates activity.

There are no simple answers, only tradeoffs. A central
theme of our work is that, in design, there are no
correct answers, only tradeoffs. Each application of a
design principle has its strengths and weaknesses; each principle
must be interpreted in a context. One of our goals is
to make the tradeoffs explicit. This point will be the topic
of the second half of the paper.

All operations are iterations towards a goal. A
second theme is that all actions of users should be considered
as part of their attempt to accomplish their goals.
Thus, even when there is an error, it should be viewed as an
attempt by the user to get to the goal. Typing mistakes or
illegal statements can be thought of as an approximation.
The task for the designer, then, is to consider each input as
a starting point and to provide appropriate assistance to
allow efficient modification. In this way, we aid the user in
rapid convergence to the desired goal. An important implication
of this philosophy is that the users' intentions be
knowable. In some cases we believe this can be done by
having the users state intentions explicitly. Because many
commands confound intentions and actions, intentions may
substitute for commands.

Low level protocols are critical. By 'protocol,' we
mean the procedures to be followed during the conduct of a
particular action or session, this meaning being derived
from the traditional meaning of protocol as 'a code of
diplomatic or military etiquette or precedence.' Low level
protocols refer to the actual operations performed by the
user - button pushes, keypresses, or mouse operations -
and these permeate the entire use of the system. If these
protocols can be made consistent, then a major standardization
takes place across all systems.

Activities are structured. User actions have an implicit
grouping corresponding to user goals; these goals may be
interrelated in various ways. Thus, a subgoal of a task is
related to the main task in a different way than is a diversion,
although both may require temporary cessation of the
main task, the starting up of new tasks, and eventual return
to the main one. We believe the grouping of user goals
should be made explicit, both to the user and to the system,
and that doing so will provide many opportunities for
improved management of the interaction. For example, the
system could constrain interpretation of user inputs by the
context defined by the current activity, the system could
remind them of where they are as they progress through a
collection of tasks, or, upon request, it could provide
suggestions of how to accomplish the current task by suggesting
possible sequences of actions. The philosophy is to
structure activities and actions so that the users perceive
themselves as selecting among a set of related, structured
operations, with the set understood and supported intelligently
by the system (see Bannon, Cypher, Greenspan, &
Monty, 1983).


Information retrieval dominates activity. Using a
computer system involves stages of activities that include
forming an intention, choosing an action, specifying that
action to the system, and evaluating the outcome. These
activities depend heavily upon the strengths and weaknesses
of human short- and long-term memory. This means that
we place emphasis upon appropriate design of file and directory
structures, command 'workbenches,' and the ability to
get information, instruction, and help on the different
aspects of the system. We are studying various representational
structures, including semantic networks, schema
structures of both conventional and 'additive memory'
form, browsers, hyper-text structures, and other retrieval
aids (see O'Malley, Smolensky, Bannon, Conway, Graham,
Sokolov, & Monty, 1983).
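
The slogan that all operations are iterations towards a goal can be
illustrated with a small sketch: a mistyped command is treated as an
approximation and answered with the closest known commands rather
than with a bare error. This is only one possible reading of the
slogan, not a description of the project's system. The command list
and the function name are hypothetical; `difflib.get_close_matches`
is part of the Python standard library.

```python
import difflib

# Hypothetical command set for a small text editor.
COMMANDS = ["replace", "copy", "delete", "quit", "zap", "set", "find"]

def interpret(user_input, commands=COMMANDS):
    """Return (command, suggestions).

    An exact match is accepted as typed. Anything else is treated as an
    approximation: command is None and suggestions lists up to three
    known commands that resemble the input, best match first.
    """
    if user_input in commands:
        return user_input, []
    suggestions = difflib.get_close_matches(user_input, commands, n=3, cutoff=0.6)
    return None, suggestions

command, suggestions = interpret("delate")
if command is None and suggestions:
    print("Did you mean:", ", ".join(suggestions))
```

The user can then pick a suggestion or edit the original input, so
the slip becomes one more step toward the intended command.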

A Demonstration System

This is where we traverse the tar pits. We feel it
essential that our ideas be tested within a working system,
not only because we feel that the real constraints of
developing a full, usable system are important design considerations
that must be faced, but also because we believe
that full evaluation can only take place within the bounds
of a complete, working environment. Therefore, we intend
to construct a test and demonstration system based around a
modern workstation using the UNIX operating system.
UNIX was chosen because it provides a rich, powerful
operating environment. However, because UNIX was
designed for the professional programmer, unsophisticated
users have great trouble with it, providing a rich set of
opportunities for our research.

Although we intend that our design principles will be
applicable to any system regardless of the particular
hardware being used, many of the concepts are effective
only on high-resolution displays that allow multiple windows
on the screen and that use simple pointing devices.
These displays allow for a considerable improvement in the
design of human-computer interfaces. We see no choice but
to brave the sirens of technology. The capabilities of the
hardware factor into the tradeoff relationships. We intend
the demonstration systems to show how the tradeoffs in
design choices interact with the technology.

Tradeoffs  in  Design 

Now let us examine one of our proposals - tradeoffs
- as a prototype of a quantitative design rule. It is well
known that different tasks and classes of users have
different needs and requirements. No single interface
method can satisfy all. Any single design technique is apt to
have its virtues along one dimension compensated by
deficiencies along another. Each technique provides a set of
tradeoffs. The design choices depend upon the technology
being used, the class of users, the goals of the design, and
which aspects of interface should gain, which should lose.
This focus on the tradeoffs emphasizes that the design problem
must be looked at as a whole, not in isolated pieces, for
the optimal choice for one part of the problem will probably
not be optimal for another. According to this view, there
are no correct answers, only tradeoffs among alternatives.

The Prototypical Tradeoff: Information Versus Time

One basic tradeoff pervades many design issues:

Factors that increase informativeness tend to decrease the
amount of available workspace and system responsiveness.

On the one hand, the more informative and complete the
display, the more useful when the user has doubts or lacks
understanding. On the other hand, the more complete the
display, the longer it takes to be displayed and the more
space it must occupy physically. This tradeoff of amount of
information versus space and time appears in many guises
and is one of the major interface issues that must be handled.
To appreciate its importance, one has only to examine
a few recent commercial offerings, highly touted for their
innovative (and impressive) human factors design, that were
intended to make the system easy and pleasurable to use,
but which so degraded system response time that serious
user complaints resulted.

It is often stated that current computer systems do not
provide beginning users with sufficient information. However,
the long, informative displays or sequences of questions,
options, or menus that may make a system usable by
the beginner are disruptive to the expert who knows exactly
what action is to be specified and wishes to minimise the
time and mental effort required to do the specification. We
pit the expert's requirement for ease of specification against
the beginner's requirement for knowledge.

I approach this problem by tackling the following
questions:

• How can we specify the gain in user satisfaction
  that results from increasing the size of a menu;

• How do we specify the user satisfaction for the
  size of the workspace in a text editor;

• How can we specify the loss in user satisfaction
  from the increase in time to generate the display
  and decrease in available workspace;

• How can we select menu size, workspace, and
  response time, when each variable affects the
  others?

I  propose  thas  we  answer  the  question  by  use  of  a  psycho¬ 
logical  measure  of  User  Satisfacuon.  This  allows  us  to 
determine  the  impact  of  changing  physical  parameters  upon 
the  psychotojicat  variable  of  user  satisfaction.  Once  we 
know  how  each  dimension  of  choice  affects  user  lansfac- 
non,  then  we  can  directly  assess  tbe  tradeoffs  among  the 
dimensions. 

Example:  Menu  Size  and  Display  Time 

Let $U(x)$, the user satisfaction for attribute $x$, be given
by a power function, $U(x) = kx^p$. (In Norman, 1983c, I give
more details of the method. See Stevens, 1974, for a review
of the power function in Psychology.) For the examples in
this paper I used the method of magnitude production to
estimate parameters.
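A power function has two free parameters, so two satisfaction judgments are enough to pin it down. The sketch below shows one way to do that; it is an illustration, not the procedure of Norman (1983c). The function names are hypothetical, and the two anchor judgments are the menu-size values reported in the paragraphs that follow.

```python
import math

def fit_power_function(x1, u1, x2, u2):
    """Return (k, p) so that U(x) = k * x**p passes through both judgments."""
    # Dividing the two equations removes k: u2/u1 = (x2/x1)**p.
    p = math.log(u2 / u1) / math.log(x2 / x1)
    # Substitute p back into either judgment to recover the scale factor.
    k = u1 / x1 ** p
    return k, p

def satisfaction(x, k, p):
    """Evaluate the power-function satisfaction U(x) = k * x**p."""
    return k * x ** p

# Anchor judgments (assumed here): 300 characters rated 50,
# 1000 characters rated 100.
k, p = fit_power_function(300, 50, 1000, 100)
print(f"k = {k:.2f}, p = {p:.2f}")
print(f"U(300) = {satisfaction(300, k, p):.1f}")
print(f"U(1000) = {satisfaction(1000, k, p):.1f}")
```

An exact two-point fit gives an exponent slightly below 0.6. Rounding the exponent to 0.6, as the text does, requires a correspondingly adjusted scale factor, so the rounded function passes near the two judgments rather than exactly through them.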

User preference for menu size. The preferred
amount of information must vary with the task, but informal
experiments with a variety of menus and tasks suggest
that for many situations, about 300 characters is reasonable:
I assigned it a satisfaction value of 50. This is the size menu
that can be requested for our laboratory's computer mail
program ('msg'). It serves as a reminder for 36 single-letter
mnemonic commands. To do the power function estimates,

Figure 1. Tradeoff of menu size for display time. Panels A and B show
User Satisfaction for menu size, $U(S) = 1.6S^{0.6}$, and display time,
$U(T) = 100T^{-1}$, respectively. Panel C shows display time as a function
of menu size, $T = S/\beta$, for different values of display rate, $\beta$
(specified in characters/second). Panel D shows the tradeoff between
$U(S)$ and $U(T)$ for different values of display rate ($\beta$).

I examined a variety of menus of different sizes for the message
system (thereby keeping the task the same). I estimated that the
menu size would have to increase to half the normal video terminal
screen (1000 characters) in order to double my satisfaction. This is
a typical result of psychological scaling; a substantial increase of
the current value is required to make the increase worthwhile. If
$U(300) = 50$ and $U(1000) = 100$, then the parameters of the power
function are $k = 1.6$ and $p = 0.6$; $U(S) = 1.6S^{0.6}$.

User preference for response time. There already
exists some literature on user satisfaction for response time:
the judgements of 'acceptable' response times given by
Shneiderman (1980, p. 228; the times are taken from Miller,
1968). The times depend upon the task being performed.
For highly interactive tasks, where the system has just
changed state and the users are about to do a new action,
2.0 seconds seems appropriate. I determined that I would
be twice as satisfied with a response time of 1 second.
Therefore, $U(2\ \text{sec}) = 50$ and $U(1\ \text{sec}) = 100$. For these
values, the power function becomes
$U(T) = 100/T$ ($k = 100$, $p = -1$).

Size of menu and display time. We need one more
thing to complete the tradeoff analysis: the relationship
between menu size ($S$) and the time to present the information
($T$). In general, time to present a display is a linear
function of $S$: $T = \alpha + S/\beta$, where $S$ is measured in characters,
$\beta$ is the display rate in characters per second (cps) and
$\alpha$ is system response time.

The tradeoff of menu size for display time.
Knowledge of $U(S)$, $U(T)$, and the relationship between $S$
and $T$ lets us determine the tradeoff between User Satisfaction
for size of the menu and for time to display the information:
$U(S)$ versus $U(T)$. If $\alpha \ll S/\beta$, then we can probably
ignore $\alpha$, letting $T = S/\beta$. This lets us solve the tradeoff
exactly (see footnote 1). If the two power functions are given by
$U(S) = aS^p$ and $U(T) = bT^q$, then $U(S) = kU(T)^{p/q}$, where
$k = a\beta^p b^{-p/q}$. The tradeoff relationship using the parameters
estimated for menus is shown in Figure 1.
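The tradeoff between menu-size satisfaction and display-time satisfaction can be traced numerically. The sketch below assumes the functions discussed in this section, $U(S) = 1.6S^{0.6}$ and $U(T) = 100/T$ with $T = \alpha + S/\beta$. It also checks a closed form obtained by eliminating $T$ when $\alpha = 0$. The function names and the three display rates are illustrative assumptions, not values taken from the paper's figures.

```python
def u_size(s, a=1.6, p=0.6):
    """Satisfaction for a menu of s characters (assumed parameters)."""
    return a * s ** p

def u_time(t, b=100.0, q=-1.0):
    """Satisfaction for a display time of t seconds (assumed parameters)."""
    return b * t ** q

def tradeoff_point(s, beta, alpha=0.0):
    """Return (U(T), U(S)) for a menu of s characters shown at beta cps."""
    t = alpha + s / beta
    return u_time(t), u_size(s)

def u_size_from_u_time(ut, beta, a=1.6, p=0.6, b=100.0, q=-1.0):
    """Closed form for alpha = 0: U(S) = k * U(T)**(p/q)."""
    k = a * beta ** p * b ** (-p / q)
    return k * ut ** (p / q)

# Illustrative display rates in characters per second (assumed values).
for beta in (30, 120, 960):
    ut, us = tradeoff_point(300, beta)
    closed = u_size_from_u_time(ut, beta)
    print(f"beta={beta:4d} cps: U(T)={ut:7.1f}  U(S)={us:5.1f}  closed form={closed:5.1f}")
```

For a fixed menu size, a faster display raises $U(T)$ without changing $U(S)$, which is why each display rate produces its own tradeoff curve.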


Maximizing Total User Satisfaction

Let overall satisfaction for the system, $U(\text{system})$, be
given by the weighted sum of the $U(x_i)$ values for each of its
attributes, $x_i$: $U(\text{system}) = \sum_i w_i U(x_i)$, where $w_i$ is the weight
for the $i$-th attribute. When there are only two attributes,
$x_1$ and $x_2$, if we hold $U(\text{system})$ constant at some value $C$, we
can determine the iso-satisfaction line:

$$U(x_1) = \frac{C}{w_1} - \frac{w_2}{w_1}\,U(x_2).$$

Thus, the iso-satisfaction functions appear on the tradeoff
graphs as straight lines with a slope of $-w_2/w_1$, with higher
lines representing higher values of $U(\text{system})$.
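As a small check on the two-attribute case, the sketch below computes points on one iso-satisfaction line and confirms that each gives the same weighted sum. The weights and the constant are arbitrary illustrative numbers, and the function names are hypothetical.

```python
def system_satisfaction(u1, u2, w1, w2):
    """Weighted sum of the two attribute satisfactions."""
    return w1 * u1 + w2 * u2

def iso_satisfaction_u1(u2, c, w1, w2):
    """U(x1) on the iso-satisfaction line for total c, given U(x2)."""
    return c / w1 - (w2 / w1) * u2

# Arbitrary illustrative weights and total (assumed values).
w1, w2, c = 1.0, 2.0, 150.0
for u2 in (0.0, 25.0, 50.0):
    u1 = iso_satisfaction_u1(u2, c, w1, w2)
    total = system_satisfaction(u1, u2, w1, w2)
    print(f"U(x2)={u2:5.1f}  U(x1)={u1:6.1f}  total={total:6.1f}")
```

Raising the weight on the second attribute steepens the line, so a unit gained in $U(x_2)$ offsets a larger loss in $U(x_1)$.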

If the tradeoff functions are concave downward (as
are some in later figures), the maximum satisfaction occurs
where the tradeoff function is tangent to the
iso-satisfaction function: that is, where the slope of the
tradeoff function equals $-w_2/w_1$. In this case, maximum satisfaction
occurs at some compromise between the two variables.


1. Letting $\alpha = 0$ simplifies the tradeoff relations, but this is not a
necessary assumption. If system response time is slow, then $\alpha$ should be
retained: the tradeoff can still be determined quite simply.

If the tradeoff functions are concave upward (as in
Figure 1), then the minimum satisfaction occurs where the
iso-satisfaction curves are tangent to the tradeoff functions.
Maximum satisfaction occurs by maximizing one of the two
attributes. The expert, for whom $w_T/w_S \gg 1$, will not
sacrifice time for a menu. The beginner, for whom
$w_T/w_S \ll 1$, will sacrifice display time in order to get as big
a menu as possible. For intermediate cases between that of
the extreme expert or beginner, the optimum solution is still
either to maximize menu size or to minimize display time,
but the user might be indifferent as to which of these two
was preferred. These conclusions apply to the tradeoff
functions of Figure 1D regardless of display rate, as long as
the curves are concave upward (see footnote 2).

These analyses say that the tradeoff solution that one
tends to think of first (to compromise between time and
menu size by presenting a small menu at some medium
amount of workspace) actually provides the least amount
of total satisfaction. Satisfaction is maximized by an all-or-none
solution. The all-or-none preference applies only to
tradeoff functions that are concave upward, such as that
between menu size and display time. Later we shall see that
when time is not relevant, the analysis of the tradeoff
between menu size and workspace predicts that even experts
will sacrifice some workspace for a menu.
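The all-or-none result can be checked numerically. The sketch below assumes $U(S) = 1.6S^{0.6}$, $U(T) = 100/T$, and $T = S/\beta$, then scans a range of menu sizes for an "expert" and a "beginner" weighting. The weights, the 30 cps display rate, and the 10 to 2000 character range are illustrative assumptions.

```python
def total_satisfaction(s, beta, w_size, w_time):
    """Weighted sum of menu-size and display-time satisfaction."""
    u_s = 1.6 * s ** 0.6          # menu-size satisfaction (assumed parameters)
    u_t = 100.0 / (s / beta)      # display-time satisfaction with T = S / beta
    return w_size * u_s + w_time * u_t

def best_and_worst_size(beta, w_size, w_time, sizes):
    """Return the menu sizes giving the highest and lowest total."""
    scored = [(total_satisfaction(s, beta, w_size, w_time), s) for s in sizes]
    return max(scored)[1], min(scored)[1]

sizes = range(10, 2001, 10)       # candidate menu sizes in characters
results = {}
for label, w_size, w_time in (("expert", 1, 10), ("beginner", 10, 1)):
    results[label] = best_and_worst_size(30, w_size, w_time, sizes)
    best, worst = results[label]
    print(f"{label:8s}: best size = {best:4d}, worst size = {worst:4d}")
```

Under these assumptions the best size falls at one end of the range for both users, and the worst size lies strictly inside it. That is the sense in which the compromise solution provides the least total satisfaction.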

Workspace 

Available workspace refers to the amount of room left
on the screen after the menu (or other information) is
displayed. This is especially important where the menu
stays on the screen while normal work continues. The tradeoff
is sensitive to screen size. If we had a screen which
could display 60 lines of text, using 6 lines to show the
current state of the system and a small menu of choices
would not decrease usability much. But if the screen could
only display 8 lines at a time, then using 6 of them for this
purpose would be quite detrimental.

User preference function for workspace. The user
preference function for workspace clearly depends upon the
nature of the task: some tasks (such as issuing a command)
may only require a workspace of 1 line; others
(such as file or text editing) could use unlimited
workspace. Let us consider the workspace preferences for
text editing of manuscripts. The most common editors can
only show 24 lines, each of 80 characters: 1920 character
positions. I let $U(1920) = 50$. To estimate the workspace
that would double the value, I imagined working with
screen editors of the sizes shown in Table 1. I concluded
that I would need the size given by the two page journal
spread. That is, $U(6400) = 100$. The power function parameters
are $k = 0.54$ and $p = 0.6$, so that $U(w) = 0.54w^{0.6}$: the
same exponent used for menu size but with a different scale
factor, $k$. This function is shown in Figure 2.


2. Whether a tradeoff function exhibits upward or downward concavity
depends upon the choice of user satisfaction function. If both functions
are power functions with one exponent positive and the other negative,
the tradeoff functions are always concave upward. When both exponents
are positive, the tradeoff functions are always concave downward. When
the two functions are logarithmic, the tradeoff functions are always
concave downward, and when they are both logistic, the tradeoff functions
are both concave upward and downward, switching from one to the other
as a function of the other variables (e.g., display rate). These conclusions
hold whenever the two variables $x_1$ and $x_2$ are linearly related.


Table 1

SIZE OF COMMON TEXTS AND DEVICES
(in characters)

TEXT OR DEVICE                               APPROXIMATE NUMBER
                                             OF CHARACTER POSITIONS

Portable computer                                    ~520
Home microprocessor (Apple II)                       ~960
Standard video display unit                         1,920
One typed manuscript page (double spaced)           2,600
One typed manuscript page (single spaced)           4,000
Journal page (Cognitive Science)                    3,200
  Double page spread                                6,400
Page of Proceedings (Gaithersburg Human
  Factors in Computer Systems)                      5,500
  Double page spread                               11,000
Newspaper page (Los Angeles Times)                 30,000
  Double page spread                               60,000

Figure 2. Panel A: User Satisfaction function for editor
workspace, $w$: $U(w) = 0.54w^{0.6}$. Typical character sizes for various
displays and texts are also shown. Panel B shows the tradeoff function of
workspace against time, $U(w)$ versus $U(T)$, for different display rates,
$\beta$ (characters/second). $U(T)$ is shown in Figure 1B: $U(T) = 100T^{-1}$.

Trading workspace for display time. One penalty for
increasing the size of the workspace is increased time to
display the workspace. If we use the same User Satisfaction
function for time as in Figure 1B, we get the tradeoff functions
shown in Figure 2B. Large workspaces require very
high display rates before they are satisfactory. Because
these tradeoff functions are concave upward (and are very
similar to the functions of Figure 1D), the same conclusions
apply here as to those earlier functions: the optimum
operating point is an all-or-none solution. Thus, the user
either prefers a large workspace, regardless of the time
penalty, or a very fast display, regardless of the workspace
penalty. Here, however, the relative weights are apt to be
determined by the task rather than by the user's level of
skill.

Suppose the task were one in which the display
changes relatively infrequently. In this case we would
expect $w_{\text{workspace}} \gg w_{\text{time}}$, so that the optimum solution is
to have as big a workspace as possible. If the task were one
that requires frequent changes in the display, then we
would expect the reverse result: $w_{\text{workspace}} \ll w_{\text{time}}$, so the
optimum solution is to shrink the workspace to the smallest
size at which the task can still be carried out, thereby
minimizing display time.

Trading workspace for menu size. Adding a menu to
the display decreases the amount of available workspace.
Let $W$ be the total size of the workspace that is available for
use, $w$ the workspace allocated to the text editor, and $m$ the
space allocated for a menu: $w = W - m$. We know that
$U(m) = am^p$ and $U(w) = b(W - m)^q$, where $a = 1.6$, $b = 0.54$, and
$p = q = 0.6$. This leads to the tradeoff functions shown in
Figure 3.
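The shape of the menu-versus-workspace tradeoff can be examined by stepping the menu allocation from nothing to the whole display. The sketch below assumes $U(m) = am^p$ and $U(w) = b(W - m)^q$. The parameter values ($a = 1.6$, $b = 0.54$, $p = q = 0.6$) are approximate readings of the fitted functions in this section and should be treated as assumptions, as should the function names.

```python
def menu_workspace_tradeoff(total, a=1.6, b=0.54, p=0.6, q=0.6, steps=20):
    """Return (U(menu), U(workspace)) pairs as the menu grows from 0 to total."""
    points = []
    for i in range(steps + 1):
        m = total * i / steps                 # characters given to the menu
        points.append((a * m ** p, b * (total - m) ** q))
    return points

def is_concave_downward(points):
    """True when successive chord slopes strictly decrease along the curve."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]
    return all(s2 < s1 for s1, s2 in zip(slopes, slopes[1:]))

curve = menu_workspace_tradeoff(1920)         # standard 24 x 80 display
print("concave downward:", is_concave_downward(curve))
for u_menu, u_work in curve[::5]:
    print(f"U(menu)={u_menu:6.1f}  U(workspace)={u_work:5.1f}")
```

With both exponents positive the curve bows outward, which is the concave downward case described in footnote 2.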

Note that $U(\text{editor workspace})$ is relatively insensitive to
$U(\text{menu size})$. This is because a relatively small sized display
makes a satisfactory menu, whereas it requires a large
display to make a satisfactory editor workspace. As a
result, changing the size of the menu by only a few lines can
make a large change in User Satisfaction, whereas the same
change in workspace is usually of little consequence.

In some commercially available editors, the menu of
commands can occupy approximately half of the (usually
24-line) screen. Figure 1 indicates that for a menu of 12 lines
(960 characters), $U(\text{menu}) = 100$. However, from Figures 2
and 3 we see that with a workspace of only 1920 characters,
a menu of around 1000 characters (or of $U(\text{menu}) = 100$)
reduces $U(\text{editor workspace})$ by almost one third from its
value of 50 with no menu. From Figure 3
we see that we would be much less impaired by the same
size menu were the workspace considerably greater. In such
cases, we have a clear tradeoff between the need for the
menu information and the desire to have a reasonable
workspace.
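The half-screen menu example can be worked through directly. The sketch below assumes a 24-line by 80-character display and the approximate functions $U(m) = 1.6m^{0.6}$ and $U(w) = 0.54w^{0.6}$; both parameter sets are assumptions read from the fitted functions in this section.

```python
def u_menu(m):
    """Menu-size satisfaction (assumed parameters)."""
    return 1.6 * m ** 0.6

def u_workspace(w):
    """Editor-workspace satisfaction (assumed parameters)."""
    return 0.54 * w ** 0.6

screen = 24 * 80        # character positions on a standard display
menu = 12 * 80          # a 12-line menu

before = u_workspace(screen)
after = u_workspace(screen - menu)
reduction = (before - after) / before

print(f"U(menu) for {menu} characters: {u_menu(menu):.1f}")
print(f"U(workspace) without menu: {before:.1f}")
print(f"U(workspace) with menu:    {after:.1f}")
print(f"fractional reduction:      {reduction:.2f}")
```

Because halving the workspace scales $U(w)$ by $0.5^{0.6}$, the fractional reduction depends only on the exponent and not on the scale factor.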

Maximizing total user satisfaction for menu and
workspace. When tradeoff functions are concave downward
(as in Figure 3), maximum satisfaction occurs where
the tradeoff function is tangent to the
iso-satisfaction function. For the user who values workspace
and menu equally (so that the preferred slope is -1), the
optimum solution is to operate at the right hand side of Figure 3.
This makes for a relatively high value of user satisfaction
for the menu (which means a large menu; the
exact sizes can be determined from Figure 1A), but with
little sacrifice in user satisfaction for workspace. The more
expert user will have an iso-satisfaction function with a
much smaller slope, and so will sacrifice menu for
workspace. Similarly, the beginner will have an
iso-satisfaction curve with high slope which will maximize menu
size at the expense of workspace. These results are quite
unlike the tradeoffs that involved time, in which an all-or-none
solution was optimal: here, the optimum values are
compromises between workspace and menu size. Display
rate and amount of total available workspace alter the point
of optimum operation. The analysis provides exact numerical
determination of how the optimal operating point is
affected by these variables.
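When both satisfaction functions share one exponent, the tangency condition can be solved in closed form. The sketch below maximizes $w_m\,a\,m^p + w_w\,b\,(W - m)^p$ by setting its derivative to zero, which gives $m/(W - m) = (w_m a / w_w b)^{1/(1-p)}$. The parameter values, the weights, and the function names are assumptions for illustration.

```python
def total_satisfaction(m, total, w_menu, w_work, a=1.6, b=0.54, p=0.6):
    """Weighted sum of menu and workspace satisfaction for menu size m."""
    return w_menu * a * m ** p + w_work * b * (total - m) ** p

def optimal_menu_size(total, w_menu, w_work, a=1.6, b=0.54, p=0.6):
    """Menu size at which the weighted sum is largest (requires 0 < p < 1)."""
    # Setting the derivative to zero gives the ratio m / (total - m).
    ratio = (w_menu * a / (w_work * b)) ** (1.0 / (1.0 - p))
    return total * ratio / (1.0 + ratio)

total = 1920
# Hypothetical weightings (menu weight, workspace weight).
for label, w_menu, w_work in (("expert", 1, 5), ("equal", 1, 1), ("beginner", 5, 1)):
    m = optimal_menu_size(total, w_menu, w_work)
    share = m / total
    print(f"{label:8s}: menu = {m:7.1f} characters ({share:.0%} of display)")
```

The optimum always lies strictly between no menu and an all-menu display, and it shifts toward the menu as the menu weight grows. The specific shares depend heavily on the assumed parameters.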


Figure 3. The tradeoff of User Satisfaction for menu size against
User Satisfaction for editor workspace, for different values of total
workspace, $W$. Horizontal lines represent constant values of $U(m)$ and,
therefore, of $m$; the values of $m$ can be determined from panel A of
Figure 1. Similarly, vertical lines represent constant values of $w$; the
values of $w$ can be determined from panel A of Figure 2.

A Critique of the Tradeoff Analysis

There are a number of problems with the tradeoff analyses
presented in this paper. There are two major criticisms,
one minor one. Let me start with the minor one, for
it represents a misunderstanding that would be good to
clear up. I illustrate the misunderstanding for the variable
of menu size, but the discussion applies to other variables as
well:

• The functions must be wrong: U(menu size) continually
increases as a function of menu size, yet
when the size gets too large, the menu becomes
less useful: U(menu size) should also decrease.


User Satisfaction for the System Is the Sum of Its Parts

This misunderstanding derives from confusing User
Satisfaction for a single attribute with User Satisfaction for a
system. A major philosophy of the tradeoffs analysis is that
a system can be decomposed into its underlying component
attributes and User Satisfaction for each assessed individually.
The User Satisfaction for the entire system can only be
determined from the combination of the User Satisfaction
values for each of its components. The satisfaction for the
amount of information conveyed by the menu continues to
increase with size, but the ability to find something
(captured by 'search time') decreases with increasing size of
a menu. The overall satisfaction for the menu is given by
the sum of the increasing satisfaction for the information
and the decreasing satisfaction for search effort: the result
is an inverted U-shaped curve that decreases as size gets too big.


Now let me address the two major critiques:

• The tradeoff functions are arbitrary;

• How do we determine the functions when we
design? It would be more useful were there a set
of standards (perhaps in handbook form).


These two issues point to unsolved problems with the
method. My defense is to argue that this procedure is new.
The goal is to introduce the philosophy and to encourage
others to help in the collection of the relevant data and in
the development of the method. However, the numbers and
the particular functions used here may be useful for the
tasks for which they were derived: they do mesh well with
my intuitions.

The  Tradeoff  Functions  Are  Arbitrary 

Although the functions used here are indeed arbitrary,
three things need to be noted. First, power functions have
a long tradition of satisfactory use in psychology and so are
apt to be good approximations. Second, I have actually
computed User Satisfaction functions using the logistic,
power, and logarithmic functions; over much of the range
of interest, the results differed surprisingly little. The
exception is that when the logistic was used for size and time,
the concave upward tradeoff functions became concave
downward at high data rates, although at low data rates they
were still concave upward (see footnote 2). Third, I agree that
the preferred thing would be to have an experimental program to
determine the exact forms and parameters of the functions.
In particular, the all-or-none prediction is sensitive to the
form of the User Satisfaction functions.

How Do We Determine the Functions When We Design?

Here, again, empirical work is needed. I suspect the
functions will be found to vary only for a reasonably small
number of classes of users, classes of tasks, and design attributes,
so that it would be possible to collect typical values
in a handbook. Alternatively, quick data collection
methods might be devised: the magnitude estimation procedures
are especially easy to apply. A handbook might be
quite valuable. Before this can be done, of course, it is
necessary to determine that the hypothesis is correct: that
there are a relatively small number of tasks, user classes,
and tradeoff variables that need be considered. Moreover,
one must extend the analysis to a larger domain of problems
and demonstrate its usefulness in actual design.


Some Other Examples of Tradeoffs

There are numerous other tradeoff analyses in addition
to the ones presented here. Three other situations
seem important enough to warrant consideration here, even
though they are not yet ready for quantitative treatment.
These are: (1) the comparison between command languages
and menu-driven systems; (2) how to choose names for commands
and files; and (3) the tradeoffs that result when
moving among computer systems of widely varying capabilities,
as in the differences between hand-held computers and
powerful, networked workstations.

Command Languages Versus Menus

The relative merits of menu-based systems and command
language systems are often debated, seldom with any
firm conclusion. It is useful to compare their tradeoffs, but
before we do, it is necessary to be clear about what is meant
by each alternative. In this context a command language
system is one in which no aids are presented to the user
during the intention or choice stages, and the action
specification is performed by typing a command, using the
syntax required by the operating system. (The distinctions
among the intention, choice, and specification stages come
from Norman, 1983b.) Command languages are the most frequent
method of implementing operating systems. Similarly,
in this context a menu-based system is one in which all commands
are presented via menus, where a command cannot
be specified unless it is currently being shown on the active
menu, and where the commands are specified either through
short names or single characters (as indicated by the menu
items) or by pointing at the relevant menu item (or perhaps
at a switch or mark indicated by the item). These are restricted
interpretations of the two alternatives, confounding
issues about the format for information presentation and
action specification. Still, because they represent common
design alternatives, it is useful to compare them.

Command languages offer experts great versatility.
Because of their large amount of knowledge and experience
with the system, experts tend to know exactly what operations
they wish performed. With a command language they
can specify their operations directly simply by typing the
names of the commands, as well as any parameters, files, or
other system options that are required. Command
languages make it easy to specify parameters (or 'flags') to
commands, something that may be difficult with menu-based
systems.

Menus offer the beginner or the casual user considerable
assistance. At any stage, information is available.
Even abbreviated menus serve as a reminder of the alternatives.
Experts often complain about menu-based systems
because of the time penalty in requesting a menu, waiting
for it to be displayed, and then searching for the desired
item. Moreover, systems with large numbers of commands
require multiple menus that slow up the expert. The problem
is that the system is designed to give help, whether or
not the user wishes it.

Two of the difficulties with menus are the delay in
waiting for them to be plotted and the amount of space they
occupy. Figure 1D shows that the tradeoff between amount
of information and time delay is especially sensitive to
information transmission rate. When transmission time
becomes fast enough, there is little penalty for menus,
whereas at slow rates of data transmission, the penalty is
high. In similar fashion, Figure 3 shows that the tradeoff
between menu size and workspace is especially sensitive to
the amount of total workspace available. When sufficient
workspace is available, there is little penalty for menus.
Thus, slow transmission rates and small workspaces bias the
design choice toward command language systems; high data
rates and large workspaces bias the system toward menu-based
systems.

The two systems also differ in the kinds of errors they
lead to and ease of error correction. In a command
language system, an error in command specification usually
leads to an illegal command; no action gets performed. This
error is usually easy to detect and to correct. In a menu-based
system, an error in specification is almost always a
legal command. This error can be very difficult to correct.
If the action was subtle, the user may not even be aware it
was performed. If the action was dramatic, the user will
often have no idea of what precipitated it, since the action
specification was unintentional.

Some of the tradeoffs associated with menu-based systems
and command language systems are summarized in
Table 2. Command languages tend to be virtuous for the
expert, but difficult for the novice; they are difficult to
learn and there are no on-line reminders of the set of possible
actions. Menus are easy to use and they provide a constant
reminder. On the other hand, menus tend to be slow
(for some purposes, the expert finds them tedious and
unwieldy) and not as flexible as command languages.

This analysis is brief and restricted to the particular
formats of command language and menu-based systems that
were described. There do exist techniques for mitigating
the deficiencies of each system. Nonetheless, the analysis is
useful, both for pointing out the nature of the issues and
for being reasonably faithful to some existing systems. In
the argument over which system is best, the answer must be
that neither is: each has its virtues and its deficiencies.

The Choice of Names for Commands and Files

Another example of a common tradeoff is in the
choice of name for a command or a file. The problem
occurs because the name must serve two different purposes:
as a description of the item and also as the string of characters
that must be typed to invoke it, that is, as the
specification; these two uses pose conflicting requirements.

Consider the properties of names when used as
descriptions. The more complete the description, the more
useful it can be, especially when the user is unsure of the
options or is selecting from an unfamiliar set of alternatives.
However, the longer the description, the more space it occupies
and the more difficult to read or scan the material. In
addition, there are often system limitations on the length
and format of names. For these reasons, one usually settles
for a partial description, counting on context or prior
knowledge to allow the full description to be regenerated by
the user.

Table 2

TRADEOFFS BETWEEN MENU-BASED SYSTEMS
AND COMMAND LANGUAGE SYSTEMS

ATTRIBUTE           MENU-BASED                    COMMAND LANGUAGE

Speed of use:       Slow, especially if large     Fast, for experts; operation
                    or if has hierarchical        can be specified exactly,
                    structure.                    regardless of system state.

Prior knowledge     Very little; can be           Considerable; user is
required:           self-explanatory.             expected to have learned set
                                                  of alternative actions and
                                                  command language that
                                                  specifies them.

Ease of learning:   High. Uses recognition        Low. Users must learn names
                    memory: easier and more       and syntax of language. If
                    accurate than recall          alternatives are numerous,
                    memory. Easy to explore       learning may take
                    system and discover           considerable time. No simple
                    options.                      way to explore system and
                                                  discover options not already
                                                  known.

Errors:             Specification error leads     Specification error usually
                    to inappropriate action:      leads to illegal command:
                    difficult to determine        easy to detect, easy to
                    what happened and to          correct.
                    correct.

Most useful for:    Beginner or infrequent        Expert or frequent user.
                    user.

Once the appropriate name has been determined, the
user enters the specification stage of operation; the user
must specify to the computer system which name is desired.
Most users are not expert typists, and so it is desirable to
simplify the specification stage. As a result, there is pressure
toward the use of short names, oftentimes to the limit
of single character command names (see footnote 3).

The desirability for short names is primarily a factor
when specification must be done by naming. When the
specification can be done by pointing, then ease of typing is
no longer a factor. Nonetheless, there are still constraints
on the name choice: the longer the name, the easier to find
and point at the desired item, but at the cost of using a
larger percentage of the available workspace, of increasing
display time, and of reducing the ease of reading and search.
Names might now be chosen so that they are visually distinct,
or so that they occupy appropriate spatial locations on the
display, in all cases adding more constraints to the naming
problem. In general the descriptive requirements tend to
push toward longer names, names that provide as much
information as possible. The specification requirements
tend to push toward shorter names, names that are easy to
type.


3. A number of systems allow for shortcuts in specification, so that one
need only use sufficient characters (plus some 'escape' or 'wildcard'
character) to distinguish the name uniquely from all other names. This
typing aid introduces its own form of naming constraint.

Handheld  Computers  Versus  Workstations 

New developments in technology are moving computer
systems in several conflicting directions simultaneously.
Workstations are getting more powerful, with large
memories, large, high resolution screens, and with very high
communication bandwidths. These developments move us
toward the ability to present as much information as is
needed by the user with little penalty in time, workspace, or
even memory space. At the same time, some machines are
getting smaller, providing us with briefcase sized and handheld
computers. These machines have great virtue because
of their portability, but severe limitations in communications
speed, memory capacity, and amount of display screen
or workspace.

Just as workstations are starting to move toward
displays capable of 1000 line resolution, showing several
entire pages of text, handheld computers move us back
toward only a few short lines (perhaps 8 lines of 40 characters
each) and communication rates of 30 cps (300
baud). The major differences between workstations and
handheld computers relevant to the tradeoffs discussion are
in the amount of memory, processor speed and power,
communication abilities, availability of extra peripherals, and
screen size: in all cases, the handheld machine has sacrificed
power for portability. Because the same people may wish to
use both handheld machines and workstations (one while at
home or travelling, the other at work), they may wish
the same programs to operate on the two machines.
However, the interface design must be different, as the tradeoff
analyses of this paper show.
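
To give a rough sense of scale for the handheld figures quoted above, the following sketch computes how long it takes to fill a display over a serial line. It assumes 10 transmitted bits per character (a common asynchronous framing, and the ratio implied by equating 300 baud with 30 cps); the function names are illustrative only.

```python
def chars_per_second(baud, bits_per_char=10):
    """Character rate of a serial line, assuming a fixed number of bits per character."""
    return baud / bits_per_char


def fill_time_seconds(lines, chars_per_line, baud, bits_per_char=10):
    """Time to transmit one full screen of text at the given line speed."""
    total_chars = lines * chars_per_line
    return total_chars / chars_per_second(baud, bits_per_char)


# Handheld figures from the text: 8 lines of 40 characters at 300 baud.
handheld_chars = 8 * 40
handheld_fill = fill_time_seconds(8, 40, 300)
```

With these numbers a full handheld screen holds 320 characters and takes a little over ten seconds to transmit, so display time is a real cost even for a very small workspace.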

Summary and Conclusions

The tradeoff analysis is intended to serve as an example
of a quantitative design tool. In some cases it may not
be possible to select an optimum design, not even for a
restricted class of activities and users. In these cases,
knowledge of the tradeoffs allows the designer to choose
intelligently, knowing exactly what benefits and limits the
system design will provide. Finally, the analyses show that
some design decisions are heavily affected by technology,
others are not. Thus, answers to design questions are
heavily context dependent, being affected by the classes of
users for whom the system is intended, the types of applications
being performed, which stages of user activities are
thought to be of most importance, and the level of technology
being employed.

The work presented here is just the beginning. In the
ideal case, the tradeoff relationships will be known exactly,
perhaps with the relevant quantitative parameters provided
in handbooks. This paper has limited itself to demonstrating
the basic principles. Considerable development must
still be done on this issue and on the other major parameters
and issues that affect the quality of the human-machine
interaction. Much work remains to be done.

A second point of the paper is to argue for more fundamental
approaches to the study of human-machine
interaction. All too often we are presented with minor studies
that do not lead to general application, or to studies
that are restricted to a particular technology. All too often
we are trapped in the tar pits of the field or seduced by the
sirens of technology. If we are to have a science of design
that can be of use beyond today's local problems, we must
learn to broaden our views, sharpen our methods, and avoid
temptation.

A major moral of this paper is that it is essential to
analyze separately the different aspects of human-computer
interaction. Detailed analyses of each aspect of the
human-computer interface are essential, of course, but because
design decisions interact across stages and classes of users,
we must also develop tools that allow us to ask for what
purpose the system is to be used, then to determine how
best to accomplish that goal. Only after the global decisions
have been made should the details of the interface design
be determined.